Lab 0 2015

From 6.034 Wiki

(Difference between revisions)

Revision as of 05:08, 6 September 2006

__NUMBERS__

The purpose of this problem set is to familiarize you with this term's problem set system and to serve as a diagnostic for programming ability and facility with MIT Scheme. 6.034 uses MIT Scheme for all of its problem sets and you will be called on to understand the functioning of large systems, as well as to write significant pieces of code yourself.

While coding is not, in itself, a focus of this class, artificial intelligence is a hard subject full of subtleties. As such, it is important that you be able to focus on the problems you are solving, rather than the mechanical code necessary to implement the solution. If you struggle with the problems in this problem set, you will likely struggle with all subsequent problem sets as well, and 6.034 will be very difficult for you.

If Scheme doesn't come back to you by the end of this problem set, we recommend that you seek extra help through the Course 6/HKN tutoring program, which matches students who want help with students who've taken and done well in a class. The department pays the tutor, and the program comes highly recommended.

Scheme resources

Some resources to help you knock the rust off of your Scheme:

Your trusty 6.001 book, Structure and Interpretation of Computer Programming, available online
The standard Scheme documentation, which is called "R5RS"
Course 6/HKN tutoring program

Problem set logistics

The first thing you need to do is set up a 6.034 section in your Athena files. From any Athena terminal, run these commands:

add 6.034
/mit/6.034/bin/6.034-setup

This script creates a 6.034-psets directory in your Athena home directory, with a subdirectory for each problem set, and sets the permissions on them so that your TA can access the files in it. Changing these permissions may affect your grade.

In order to get credit for a problem set, you must copy the files containing any code you wrote into the appropriate directory. This is how you submit your solutions.

DrScheme

There are many versions of Scheme out there. 6.001 used different ones in different places. In order for the code in this class to work, though, we need to standardize on one version of Scheme.

This class will be done entirely using the DrScheme environment. You may have used DrScheme to do your projects in 6.001. In fact, DrScheme supports several dialects of the Scheme language, so the one you should pick is called "PLT Textual".

You should not use 6.001 Scheme or MIT Scheme to do the problem sets! If you do, you will get strange error messages, and the TAs will not be able to help you.

You run DrScheme on Athena like this:

add drscheme
drscheme &

You can also download it to run it on your own computer.

Getting the problem set code

If you are working on Athena: The code for the problem sets is in the 6.034 locker. You can get problem set 0 like this: cp /mit/6.034/psets-f06/ps0/* ~/6.034-psets/ps0/; Then, you can edit the code in your ~/6.034-psets/ps0 directory. That way, you won't need to do anything to submit the code - it will already be in the right place.
If you are working on another computer with DrScheme: Create a folder for the problem set.; Download this file and extract it: http://web.mit.edu/6.034/psets-f06/ps0/ps0.zip

You can also view the code without downloading it: http://web.mit.edu/6.034/psets-f06/ps0/

Answering questions

The main file of this problem set is called ps0.scm. Open that file in DrScheme. The file contains a lot of (define) statements that you need to fill in with your solutions.

The first thing to fill in is a multiple choice question. The answer should be extremely easy. Many problem sets will begin with some simple multiple choice questions to make sure you're on the right track.

Run the tester

Every problem set comes with a file called tester.scm. This file checks your answers to the problem set. For problems that ask you to provide a function, the tester will test your function with several different inputs and see if the output is correct. For multiple choice questions, the tester will tell you if your answer was right. Yes, that means that you never need to submit wrong answers to multiple choice questions.

Open the file tester.scm in DrScheme and click "Run".
It should output the results of a lot of tests in your DrScheme window. You should pass one test (your answer to the multiple choice question), and fail the others, because you haven't solved those problems yet.

You should run the tester early and often, and definitely make sure you pass all the tests before you submit a problem set. Think of it as being like the "Check" button from 6.001. It makes sure you're not losing points unnecessarily.

Scheme programming

Now it's time to write some Scheme.

Warm-up stretch

Write the following functions:

(cube n), which takes in a number and returns its cube. For example, (cube 3) => 27.
(factorial n), which takes in a non-negative integer n and returns n!, which is the product of the integers from 1 to n. (0! = 1 by definition.)
(count-pattern pattern lst), which counts the number of times a certain pattern of symbols appears in a list, including overlaps. So (count-pattern '(a b) '(a b c e b a b f)) should return 2, and (count-pattern '(a b a) '(g a b a b a b a)) should return 3.

Expression depth

One way to measure the complexity of a mathematical expression is the depth of the expression describing it in Scheme. Write a program that finds the depth of an expression.

For example:

(depth 'x)} => 0
(depth '(expt x 2)) => 1
(depth '(+ (expt x 2) (expt y 2))) => 2
(depth '(/ (expt x 5) (expt (- (expt x 2) 1) (/ 5 2)))) => 4

Tree reference

Your job is to write a procedure that is analogous to list-ref, but for trees. This "tree-ref" procedure will take a tree and an index, and return the part of the tree (a leaf or a subtree) at that index. For trees, indices will have to be lists of integers. Consider the tree in Figure 1, represented by this Scheme list: (((1 2) 3) (4 (5 6)) 7 (8 9 10))

To select the element 9 out of it, we’d normally need to do something like (second (fourth tree)). Instead, we’d prefer to do (tree-ref tree (list 3 1)) (note that we’re using zero-based indexing, as in list-ref, and that the indices come in top-down order; so an index of (3 1) means you should take the fourth branch of the main tree, and then the second branch of that subtree). As another example, the element 6 could be selected by (tree-ref tree (list 1 1 1)).

Note that it’s okay for the result to be a subtree, rather than a leaf. So (tree-ref tree (list 0)) should return ((1 2) 3).

Matching

Throughout the semester, you will need to understand, manipulate, and extend complex algorithms implemented in Scheme. You may also want to write more functions than we provide in the skeleton file for a problem set.

In this problem, you will implement a simple pattern matching system, of the kind that is used in the rule-based expert systems you'll learn about next week.

The matcher compares the corresponding pieces of two lists (which may have other lists inside them -- that is, they may be trees). In the first list, called the pattern, some of these pieces may be variables -- they can match any value as long as they do so consistently. The other list has no variables, and is called the datum.

When a variable is matched to a value, that value needs to be stored to make sure that later instances of that variable match the same value. These connections between variables and values are stored in a structure called the bindings, and variables that have been matched are said to be bound.

The matcher is outlined in the match.scm file. The parts that are left for you to write are:

Matching variables against their corresponding values, and adding the appropriate bindings.
Matching entire lists by ensuring that their corresponding pieces match

The code for this matcher is in match.scm; you will finishing it by completing the two functions at the end of the file, match-variable and match-lists.

The code starts by defining some abstractions:

A variable expression indicates that a certain part of the pattern is a variable. It's represented by a two element list whose first element is the symbol ? and whose second element is the variable's name. So the variable x has the expression (? x).
A binding gives a value for a variable. It's represented by a two-element list, whose first element is the variable's name (not the whole expression) and whose second element is the value. So if the variable x gets bound to the list (3 4), then the binding that says so looks like this: (x (3 4))
A set of bindings stores multiple bindings, associating multiple variables with their values. When a match succeeds, it returns the set of bindings that makes it work. A set of bindings has the symbol bindings as its first element, and the rest of the list consists of the bindings themselves.

An example of a set of bindings looks like this: (bindings (x (3 4)) (y b) (z 3)). The variables are named x, y, and z, and their respective values are (3 4), the symbol b, and the number 3.

Some examples of matching:

(match '(a (? x) c) '(a b c)) → (bindings (x b))
(match '(a ((? x) c) d) '(a (b c) d)) → (bindings (x b))
(match '(a ((? x) c) (? y)) '(a (b c) c)) → (bindings (y c) (x b))
(match '(a (b c) d) '(a (b c) d)) → (bindings)

If a variable with the same name occurs more than once in a pattern, it must match equal parts of the data:

(match '(a (? x) c (? x) e) '(a b c b e)) → (bindings (x b))
(match '(a (? x) c (? x) e) '(a b c d e)) → ()
(match '(a (? x) c (? y) e) '(a b c d e)) → (bindings (y d) (x b))

The code also supports nameless variables (? _), which simply match but whose matching values are not returned and not constrained to be equal:

(match '(a (? _) c (? _) e) '(a b c b e)) → (bindings)
(match '(a (? _) c (? _) e) '(a b c d e)) → (bindings)

A variable can match any component of the data list, but it will not match a sublist of the data:

(match '(a (? x) d) ’(a b c d)) → #f

The tester will make sure that your code matches these test cases and some other ones. The tests we run after you submit your problem set will be fairly similar. We don't intend to surprise you with "trick test cases", so don't worry.

Survey

We are always working to improve the class. Most problem sets will have at least one survey question at the end to help us with this. Your answers to these questions are purely informational, and will have no impact on your grade.

Please fill in the answers to the following questions in the definitions at the end of ps0.scm:

When did you take 6.001?
How many hours did 6.001 projects take you?
How well do you feel you learned the material in 6.001?
How many hours did this problem set take you?

When you're done

Remember to run the tester, and then to copy your code into your 6.034-psets/ps0 directory on Athena if you haven't already.

@@ Line 118: / Line 118: @@
 == Matching ==
-Throughout the semester, you will need to understand, manipulate, and extend complex algorithms imple-
+Throughout the semester, you will need to understand, manipulate, and extend complex algorithms implemented in Scheme. You may also want to write more functions than we provide in the skeleton file for a problem set.
-mented in Scheme. You may also want to write more functions than we provide in the skeleton file for a
-problem set.
-In this problem, you will work with a small matcher system. If you have trouble with this problem, you
+In this problem, you will implement a simple pattern matching system, of the kind that is used in the rule-based expert systems you'll learn about next week.
-will have a hard time learning important course material by means of problem sets.
-In this problem, you will work with the simple matcher in the <code>match.scm</code> file. You job will be to extend
-the matcher program with a new type of variable that can match sequences as well as single elements in a
-list.
-The matcher is given two lists as input, one (the pattern) contains variables and the other (the data)
+The matcher compares the corresponding pieces of two lists (which may have other lists inside them -- that is, they may be trees). In the first list, called the ''pattern'', some of these pieces may be ''variables'' -- they can match any value as long as they do so consistently. The other list has no variables, and is called the ''datum''.
-does not, and it returns a list of variable bindings that would make the pattern match the data. We denote
-simple variables by a list whose first element is the symbol <code>?</code> and whose second element is the name of the
-variable, e.g. <code>(? x)</code> represents the variable ''x''. For example:
-* (match '(a (? x) c) '(a b c)) &rarr; (bindings (x b))
+When a variable is matched to a value, that value needs to be stored to make sure that later instances of that variable match the same value. These connections between variables and values are stored in a structure called the ''bindings'', and variables that have been matched are said to be ''bound''.
+The matcher is outlined in the <code>match.scm</code> file. The parts that are left for you to write are:
+* Matching variables against their corresponding values, and adding the appropriate bindings.
+* Matching entire lists by ensuring that their corresponding pieces match
-Note that the return value is a list, whose car is the symbol bindings and whose cdr is a list of lists
+The code for this matcher is in <code>match.scm</code>; you will finishing it by completing the two functions at the end of the file, <code>match-variable</code> and <code>match-lists</code>.
-(possibly null). The first element of the component lists is the name of a variable and the second element of
-these lists is the matched value. Here are some more examples of successful matches:
+The code starts by defining some abstractions:
+* A '''variable expression''' indicates that a certain part of the pattern is a variable. It's represented by a two element list whose first element is the symbol <code>?</code> and whose second element is the variable's name. So the variable ''x'' has the expression <code>(? x)</code>.
+* A '''binding''' gives a value for a variable. It's represented by a two-element list, whose first element is the variable's name (not the whole expression) and whose second element is the value. So if the variable ''x'' gets bound to the list <code>(3 4)</code>, then the binding that says so looks like this: <code>(x (3 4))</code>
+* A '''set of bindings''' stores multiple bindings, associating multiple variables with their values. When a match succeeds, it returns the set of bindings that makes it work. A set of bindings has the symbol <code>bindings</code> as its first element, and the rest of the list consists of the bindings themselves.
+:An example of a set of bindings looks like this: <code>(bindings (x (3 4)) (y b) (z 3))</code>. The variables are named ''x'', ''y'', and ''z'', and their respective values are <code>(3 4)</code>, the symbol <code>b</code>, and the number <code>3</code>.
+Some examples of matching:
+* (match '(a (? x) c) '(a b c)) &rarr; (bindings (x b))
 * (match '(a ((? x) c) d) '(a (b c) d)) &rarr; (bindings (x b))
 * (match '(a ((? x) c) (? y)) '(a (b c) c)) &rarr; (bindings (y c) (x b))
@@ Line 152: / Line 155: @@
 The code also supports nameless variables <code>(? _)</code>, which simply match but whose matching values are
 not returned and not constrained to be equal:
-* (match '(a (? ) c (? ) e) '(a b c b e)) &rarr; (bindings)
+* (match '(a (? _) c (? _) e) '(a b c b e)) &rarr; (bindings)
-* (match '(a (? ) c (? ) e) '(a b c d e)) &rarr; (bindings)
+* (match '(a (? _) c (? _) e) '(a b c d e)) &rarr; (bindings)
-A simple variable can match any component of the data list, but it will not match a sublist of the data:
-* (match '(a (? x) d) ’(a b c d)) &rarr; ()
-What we want to do in this problem is extend the matcher to handle a new class of variable – called
-"segment variables" – that can match sublists of the data:
-* (match '(a (* x) d) '(a b c d)) &rarr; (bindings (x (b c)))
-* (match '(a (* x) d) '(a d)) &rarr; (bindings (x ()))
-* (match '((* x) a (* x) c) '(a c)) &rarr; (bindings (x ()))
-* (match '((* x) a (* x) c) '(b a b c)) &rarr; (bindings (x (b)))
-* (match '((* x) a (* x) c) '(b a d c)) &rarr; ()
-* (match '((a (* x)) d e) '((a b c) d e)) &rarr; (bindings (x (b c)))
-Note that the binding value of a segment variable is always a list, possibly a null list. Therefore,
-* (match '((? x) a (* x) c) '(b a b c)) &rarr; ()
-Because (* x) can not be bound to ’b, it must be bound to a list, such as ’(b).
-Note also that there is, in general, no unique assignment to the segment variables to produce a match.
-We just want to produce some assignment if one exists.
-Also, we treat the case of standalone segment variables that are not part of a list, that is <code>(match '(*
+A variable can match any component of the data list, but it will not match a sublist of the data:
-x) ...)</code>, as immediate failures since it is not clear what it should mean. On the other hand, <code>(match '(?
+* (match '(a (? x) d) ’(a b c d)) &rarr; #f
-x) ...)</code> always succeeds.
-Your job is to extend the matcher to support segment variables. Most of the work has already been
+The tester will make sure that your code matches these test cases and some other ones. The tests we run after you submit your problem set will be fairly similar. We don't intend to surprise you with "trick test cases", so don't worry.
-done for you. What remains is to read and understand how the matcher works, then implement the missing
-functionality needed for the match-segment-variable function.
 == Survey ==