Installing Cassandra on OS X Mavericks using Docker on Vagrant

Just a quick note – I’m in the process of publishing a book, Clojure Recipes. You can see it for presale on Amazon here.

Introduction

This post shows you how to setup a Cassandra Cluster from scratch on MacOS Mavericks using Docker running on Vagrant. This is non-trivial and if everything goes well, will probably take an hour.

Note that this are my notes from going through this process, so there will definitely be room for improvement, either a simpler way to do things or a better way to explain what is going on.

Assumptions

  • This has been written for a Macbook Pro Retina running MacOS X 10.9.3 (Mavericks) with 16GB of RAM. You’ll need to make a judgment as to whether your setup is similar enough for these steps to apply to your system.

Outline

This has the following steps:

A. Install Homebrew
B. Install brew-cask
C. Install Vagrant
D. Set up a Vagrant instance running Ubuntu
E. Install Docker on Vagrant
F. Clone docker-cassandra on Vagrant
G. Customise the docker-cassandra instance
H. Start the Cassandra cluster
I. Test from your mac
J. Installing Cassandra on the mac

 

Process Steps

A. Install Homebrew

1. Run the following command to install Homebrew if you don’t already have it installed:
$ ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"

B. Install brew-cask

2. Run the following commands to install brew-cask
$ brew tap caskroom/cask
$ brew install brew-cask

C. Install vagrant

3. Run the following command to install vagrant:
brew cask install vagrant

D. Set up a Vagrant instance running Ubuntu

4. To select a Ubuntu Image for Vagrant – run the following:
vagrant box add ubuntu-server-14.04 https://oss-binaries.phusionpassenger.com/vagrant/boxes/latest/ubuntu-14.04-amd64-vbox.box

vagrant init ubuntu-server-14.04

5. To configure networking for Vagrant – modify the file in /home/users//VagrantFile to add at the end (after the End):
Vagrant::Config.run do |config|
config.vm.forward_port 9160, 9160
config.vm.forward_port 9042, 9042
end

(Note that docker-cassandra only supports forwarded ports – it doesn’t support vagrant bridges)

6. Start the Vagrant Ubuntu instance:
vagrant up

7. Test the vagrant instance by connecting:
vagrant ssh
You are now shelled into the vagrant terminal.

E. Install Docker on Vagrant

8. Run the following to install docker
sudo apt-get update
sudo apt-get install docker.io
sudo ln -sf /usr/bin/docker.io /usr/local/bin/docker
sudo sed -i '$acomplete -F _docker docker' /etc/bash_completion.d/docker.io

9. Test docker works with
sudo docker run -i -t ubuntu /bin/bash
You are now shelled into the docker terminal

10. Exit from docker to vagrant
exit

F. Clone docker-cassandra on Vagrant

11. Install git on the vagrant instance (Vagrant terminal)
apt-get install git

12. Install nano on the vagrant instance (Vagrant terminal)
apt-get install nano

13. Clone docker-cassandra:
git clone https://github.com/nicolasff/docker-cassandra.git

G. Customise the docker-cassandra instance

14. Modify the docker install scripts to allow external password authentication
nano docker-cassandra/install/bin/install-cassandra

15. Add the lines
sed -i -e 's/authenticator: AllowAllAuthenticator/authenticator: PasswordAuthenticator/g' $CONFIG
sed -i -e 's/authorizer: AllowAllAuthorizer/authorizer: CassandraAuthorizer/g' $CONFIG

H. Start the Cassandra cluster

16. Create the docker image (vagrant terminal)
make image VERSION=2.0.3

17. Confirm the image has been created with
./list-images.sh
You should get
2.0.3

18. Start 3 nodes on the cluster with: (vagrant terminal)
./start-cluster.sh 2.0.3 3

19. List the nodes running: (vagrant terminal)
sudo docker ps
You should get a result similar to:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1b8062746694 cassandra:2.0.3 /usr/bin/start-cassa 11 minutes ago Up 11 minutes 9042/tcp, 9160/tcp sleepy_darwin
365c4df8fc65 cassandra:2.0.3 /usr/bin/start-cassa 11 minutes ago Up 11 minutes 9042/tcp, 9160/tcp angry_fermat
c36f05d87f9b cassandra:2.0.3 /usr/bin/start-cassa 11 minutes ago Up 11 minutes 0.0.0.0:9042->9042/tcp, 0.0.0.0:9160->9160/tcp determined_franklin

20. Check the setup of the nodes:
./client.sh 2.0.3 nodetool -h cass1 status
You should get a result similar to:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.100.3 125.64 KB 256 33.4% 07d65d03-9e95-402c-8ebc-aad77fc22aef rack1
UN 192.168.100.2 116.96 KB 256 34.3% 1e525775-1393-49af-8e95-c7527a6b0e73 rack1
UN 192.168.100.1 114.91 KB 256 32.2% 5c5f239c-60b0-4960-8a89-21a3aafb13d7 rack1

I. Test from your mac

21. On a mac terminal run:
$ telnet 192.168.1.17 9160
This should give:
Trying 192.168.1.17...
Connected to 192.168.1.17.
Escape character is '^]'.

Then control-C to escape
^CConnection closed by foreign host.
This showed it worked.

J. Installing Cassandra on the mac

22. Ensure you have Java installed by running:
java -version
This should give the result similar to:
java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b40)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b55, mixed mode)

If you don’t have Java installed – you can get it from here.

23. Download Cassandra from here to your Downloads directory.

24. From a terminal in the apache-cassandra-2.0.8 directory – run
csqlsh -u cassandra -p cassandra
You should get:
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.3 | CQL spec 3.1.1 | Thrift protocol 19.38.0]
Use HELP for help.
cqlsh>

Congratulations – you have installed a Cassandra Cluster on your Mac using Vangrant and Docker.

References:

LambdaJam Brisbane 2014

I had the opportunity to run a workshop at LambdaJam Brisbane, and it was a blast. The conference is refreshing because it attracts all the really deep thinkers from the software development community in Australia, and it throws in a couple of big name international speakers as well.

They’ve just released the photos here. I’m sharing a few below.

LambdaJam Additional Photos

It was great to meet the guys from the Melbourne and Brisbane functional programming communities, and continue the conversations over dinner.

Hilights from the talks included:

My workshop was “Applying the paradigms of core.async in Clojure and ClojureScript”. The slides and code are available.

LambdaJam Photo 1

To the organisers – Dave Thomas and Craig Smith for organising a really fun conference.

LambdaJam Photo 2

Thanks to Liam McLennan and Craig Smith for the photos.

Just a quick note – I’m in the process of publishing a book, Clojure Recipes. You can see it for presale on Amazon here.

Steps to Setup a Cassandra Cluster on MacOS Mavericks using Virtualbox VMs

Just a quick note – I’m in the process of publishing a book, Clojure Recipes. You can see it for presale on Amazon here.

Introduction

This post shows you how to setup a Cassandra Cluster from scratch on MacOS Mavericks using Virtualbox VMs. This is non-trivial and if everything goes well, will probably take a couple of hours.

Note that this are my notes from going through this process, so there will definitely be room for improvement, either a simpler way to do things or a better way to explain what is going on.

Assumptions

  • This has been written for a Macbook Pro Retina running MacOS X 10.9.3 (Mavericks) with 16GB of RAM. You’ll need to make a judgment as to whether your setup is similar enough for these steps to apply to your system.
  • VirtualBox uses the language ‘guest’ to describe the client virtual machine running. This is not to be confused with the ‘guest login’ (which we won’t use anyway)

Pre-requisites

  •  you need at least 9G free on your hard drive
  •  you need at least 2.5G free RAM (running three virtual machines)

Outline

This has the following steps:

A. Downloads
B. Installing VirtualBox on a Mac
C. Getting Started with VirtualBox
D. Setting Up Ubuntu
E. Setting Up Java
F. Installing Cassandra
G. Run and Check the Cassandra Server
H. Duplicate the Virtual Machine for the Cluster
I. Setup Cluster Networking
J. Setting Up Cassandra

 

Process Steps

A. Downloads

1. Download VirtualBox – https://www.virtualbox.org/wiki/Downloads

2. Download The VirtualBox Extension Pack (Same URL as above)

3. Download a VirtualBox instance for Ubuntu 12.10

http://virtualboxes.org/images/ubuntu/#ubuntu1304

B. Installing VirtualBox on a Mac

4. Install VirtualBox (VirtualBox-4.3.6-91406-OSX.dmg)

5. Run VirtualBox to ensure it installed correctly, then exit the application.

6. Double-click the VirtualBox extension pack you downloaded earlier – follow the prompts to install the pack in Virtual Box

7. Expand the Ubuntu VirtualBox instance you downloaded earlier

C. Getting Started with VirtualBox

8. In VirtualBox start a new Machine with Machine -> Add and select the VirtualBox instance you downloaded earlier

9. Start the machine

10. If prompted for the Virtual Media Manager – remove the two mounted .iso virtual drives as you no longer have access

11. When logging in – use the password reverse for the ubuntu login, unless otherwise instructed at step 3

D. Setting up Ubuntu

12. Select the large cog in the bottom left hand corner for System Settings. Select Keyboard layout (not Keyboard). Use the plus button to add your keyboard, and the minus button to deselect Italian.

13. From the cog in the top right hand corner – select ‘about this computer’ – then wait for the ‘checking for updates’ button to change to ‘install updates’ – then install the updates. This may require a restart and re-login of the virtual machine

14. From the top left hand corner select Dash Home then Terminal

15. Update the keyboard from Italian to your local setting at the server level using the command

sudo dpkg-reconfigure keyboard-configuration

16. Install the virtual box guest additions on the ubuntu guest virtual instance

sudo apt-get install virtualbox-guest-additions

17. In your web browser, log onto the datomic site at http://my.datomic.com/ – request your keys under Account and download Datomic Pro from the Download tab. Also download the license key.

E. Setting Up Java

18. Install Java with the following commands:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer

19. Check the installation with

java -version

(Ensure that you get Java 1.7 and not an OpenJDK variant)

20. Put JAVA_HOME into the environment using the following commands:

sudo nano /etc/environment
add the line:
JAVA_HOME=/usr/lib/jvm/java-7-oracle

21. Restart the machine, login and open a new terminal

sudo shutdown -r now

22. run the command

echo $JAVA_HOME

Ensure you get

/usr/lib/jvm/java-7-oracle

as a response.

F. Installing Cassandra

23. Locate the URL of the latest (post 2.0) install of Cassandra here http://cassandra.apache.org/download/

24. Create a Cassandra directory

sudo mkdir /home/cassandra

25. In a terminal window run

wget http://mirror.ventraip.net.au/apache/cassandra/2.0.3/apache-cassandra-2.0.3-bin.tar.gz
tar -xvzf apache-cassandra-2.0.3-bin.tar.gz
mv apache-casandra-2.0.3 /home/cassandra

26. Set up the folders:

sudo mkdir /var/lib/cassandra
sudo mkdir /var/log/cassandra
sudo chown -R $USER:$GROUP /var/lib/cassandra
sudo chown -R $USER:$GROUP /var/log/cassandra

27. Modify the environment

sudo nano /etc/environment
CASSANDRA_HOME=/home/cassandra/apache-cassandra-2.0.3

#add this to the end of the existing path variable – don’t add a new path entry.

PATH=…:$CASSANDRA_HOME/bin

28. Restart the virtual machine, login and start a terminal

sudo shutdown -r now

29. Check the settings with

echo %CASSANDRA_HOME

This should point to the directory just created.

G. Run and Check the Cassandra Server

30. Run

sudo sh /home/cassandra/apache-cassandra-2.0.3/bin/cassandra

31. Start a new terminal and run

sudo sh /home/cassandra/apache-cassandra-2.0.3/bin/cassandra-cli

Ensure that you get Connected to “Test Cluster” in the output

32. Exit the cli by running

exit;

33. In the other window, stop Cassandra by running

CTRL-C

If that doesn’t work – try this.

34. Clear the Cassandra Data

sudo rm -rf /var/lib/cassandra/*

35. Close and save the Virtual machine in VirtualBox

H. Duplicate the Virtual Machine for the Cluster

36. Right click on the virtual machine in the VirtualBox Manager and select Clone. Then ensure the checkbox Reinitialize the MAC address is checked. After this check Full Clone. You have now created the second node in the cluster.

37. Repeat this for the third node in the cluster.

I. Setup Cluster Networking

38. In Virtualbox select the menu VirtualBox -> Preferences -> Network -> Host Only Networks and select the plus button to create a new host-only network (take the default network name – probably vboxnet0)

39. Edit this setting and take note of of the values for
(a) Adapter-IPv4 Address
(b) DHCP Server – Server Address
(c) DHCP – Server Mask
(d) Lower address bound
(e) upper address bound

40. In the VirtualBox manager screen, control-click the second node and select Settings -> Network -> Adapter 1. Then change the drop down list value to Host Only Adapter. Then ensure that the name matches the network name set before (vboxnet0 for example).

41. In the terminal – run

ifconfig

and take note of the following values

(f) inet addr
(g) bcast
(h) mask

42. Now using terminal and nano – edit the /etc/network/interfaces file

sudo nano /etc/network/interfaces

and add the following:

iface  eth0 inet static
address 192.168.56.2

# i.e. below the dhcp range
netmask 255.255.255.0
network 192.168.56.1
broadcast 192.168.56.255
gateway 192.168.56.0

Ensure that you leave the entries for lo loopback

43. Now edit the file

sudo nano /etc/udev/rules.d/70-persistent-net.rules

to comment out all rows but the last one (and its comment) and change the eth# (eg eth1) to eth0 (at the end of the row)

44. Now restart the virtual machine:

sudo shutdown -r now

45. Confirm this has worked by ensuring your IP address you just set up comes up under eth0 when you run

ifconfig

If there is an eth1 entry – something has gone wrong – redo the steps above.

J. Setting Up Cassandra

46. Note the following Node Numbers

Node 0: 0
Node 1: 3074457345618258602
Node 2: 6148914691236517205

These were generated from here https://dl.dropboxusercontent.com/u/30184176/digitalocean/tokenCalc/index.html

47. On node 0 – edit the Cassandra settings file

nano /home/cassandra/apache-cassandra-2.0.3/conf/cassandra.yaml

and set the following settings:

cluster_name: ‘VirtualBox Cluster’

initial_token: 0

seed_provider:
- seeds:  "192.168.56.2"

listen_address: 192.168.56.2

rpc_address: 0.0.0.0

endpoint_snitch: RackInferringSnitch

48. On node 1 – edit the Cassandra settings file

nano /home/cassandra/apache-cassandra-2.0.3/conf/cassandra.yaml

and set the following settings:

cluster_name: ‘VirtualBox Cluster’

initial_token: 3074457345618258602

seed_provider:
- seeds:  "192.168.56.2"

listen_address: 192.168.56.3

rpc_address: 0.0.0.0

endpoint_snitch: RackInferringSnitch

49. On node 2 – edit the Cassandra settings file

nano /home/cassandra/apache-cassandra-2.0.3/conf/cassandra.yaml

and set the following settings:

cluster_name: ‘VirtualBox Cluster’

initial_token: 6148914691236517205

seed_provider:
- seeds:  "192.168.56.4"

listen_address: 192.168.56.4

rpc_address: 0.0.0.0

endpoint_snitch: RackInferringSnitch

50. Now start Cassandra on the first node (the seed node) using:

sudo sh /home/cassandra/apache-cassandra-2.0.3/bin/cassandra

(note that this will run as a background process – so if you hit enter after waiting for 30 seconds, you’ll get your command prompt back)

51. Now repeat this on each of the two secondary servers

sudo sh /home/cassandra/apache-cassandra-2.0.3/bin/cassandra

52. Fire up cqlsh and run a test:

/home/cassandra/apache-cassandra-2.0.3/bin/cqlsh -u cassandra -p cassandra

53. Run through the CQL commands here

References:

Steps to Setup Datomic on a Cassandra Cluster on MacOS Mavericks using Virtualbox

Just a quick note – I’m in the process of publishing a book, Clojure Recipes. You can see it for presale on Amazon here.

Introduction

This post shows you how to setup Datomic running on Cassandra using Virtualbox VMs from scratch. This is non-trivial and if everything goes well, will probably take a couple of hours.

Note that this are my notes from going through this process, so there will definitely be room for improvement, either a simpler way to do things or a better way to explain what is going on.

Assumptions

  • This has been written for a Macbook Pro Retina running MacOS X 10.9.3 (Mavericks) with 16GB of RAM. You’ll need to make a judgment as to whether your setup is similar enough for these steps to apply to your system.
  • VirtualBox uses the language ‘guest’ to describe the client virtual machine running. This is not to be confused with the ‘guest login’ (which we won’t use anyway)

Pre-requisites

  •  you need at least 9G free on your hard drive
  •  you need at least 2.5G free RAM (running three virtual machines)
  •  to actually do this – you need a datomic pro evaluation key – you can get this by registering on the my.datomic.com site – and then under Account click Send licence key and under Downloads – and also get the latest version of Datomic Pro.

Outline

This has the following steps:

Note that steps A-J are here.
K. Setting Up Datomic
L. Running Datomic

 

Process Steps

Note that steps 1-53 are here:

K. Setting Up Datomic:

54. Expand the datomic download

unzip datomic-pro-0.9.4384.zip -d datomic-pro-0.9.4384
cd datomic-pro-0.9.4384

55. Provision a keyspace and table for datomic from the datomic directory:

/home/cassandra/apache-cassandra-2.0.3/bin/cqlsh -f bin/cql/cassandra-keyspace.cql -u cassandra -p cassandra

56. Copy the cassandra transactor sample into the config directory

cd config/samples
cp cassandra-template.properties ..
cd ..

57. Modify the cassandra template to setup the properties of our local instance.

nano cassandra-template.properties
cd ..

58. Copy the licence key you downloaded (at step 16) into the file under the property entry

licence-key=

59. Set the following property entries

cassandra-host=localhost

60. Start the datomic transactor with  the new cassandra template properties we just created:

bin/transactor config/cassandra-template.properties

Ensure you get a result like:

System started datomic:case://localhost:9042/datomic.datomic/<DB-NAME>

Note this down – this will be the URI to connect.

L. Running Datomic

61. In a new command prompt in the datomic-pro directory run:

bin/shell

Ensure you get the prompt

Datomic Java Shell

62. Run through the following commands from the datomic tutorial: http://docs.datomic.com/tutorial.html

String uri = "datomic:cass://localhost:9042/datomic.datomic/seattle";
Peer.createDatabase(uri);
conn = Peer.connect(uri);

63. Load up the schema

schema_rdr = new FileReader("samples/seattle/seattle-schema.dtm");
schema_tx = Util.readAll(schema_rdr).get(0);
txResult = conn.transact(schema_tx).get();

64. Add some seed data

data_rdr = new FileReader("samples/seattle/seattle-data0.dtm");
data_tx = Util.readAll(data_rdr).get(0);
txResult = conn.transact(data_tx).get();

65. Query the database

db = conn.db();
for (Object result : results) {
entity = db.entity(result.get(0));
System.out.println(entity.get(":community/name"));
}

Voila – you have a Datomic database running on Cassandra across three nodes!

References:

Announcing Clojure Recipes

My book Clojure Recipes, is to be published for sale on October, 2015.

This is what the book is about:

Developers are discovering the immense power of Clojure’s functional programming model to quickly solve problems in domains ranging from social networking to Big Data. Targeting the Java Virtual Machine, Clojure also leverages the Java platform’s maturity and enormous ecosystem. Clojure Recipes is a “code recipe book” for this increasingly popular language.

Julian Gamble focuses on practical and complete examples that illuminate Clojure’s key features and show step-by-step how to solve real-world problems with it.  Clojure Recipes provides a series of “learn by doing” step-by-step projects, you’ll learn how to:

    • Write your own DSL
    • Build a website with Pedestal
    • Add ClojureScript to your website
    • Abstract boilerplate code into a macro
    • Get started with Storm
    • Build an application with Datomic
    • Build log readers, web app monitors, web testing suites, customized Ant tasks, and more

Here is a little about me:

Julian Gamble (Sydney, Australia) is a software engineer who has worked in the financial services industry for more than a decade. When he’s not enabling billions of dollars to orbit the globe, he writes and presents on all things software related.

FAQ

What about the cover?

We haven’t got that done yet. We’ll update it as soon as we know

 Isn’t there another book with a similar title?

We think it’s great that the Clojure community has grown beyond ‘Introduction to Clojure’ books. We’re both seeking to grow the Clojure community. They have a fantastic team working on it and we wish them all the best.

When did you start with this?

I signed a Contract last December with Pearson publishing. Since then I’ve had my head down getting it ready.

Lambdajam Brisbane

One of my friends described Lambdajam 2013 Brisbane as ‘One of the funnest conferences I’ve been on.’ I had a blast.

I  gave a Lightning Talk on Clojure.

Clojure Lightning Talk - Julian Gamble

Clojure Lightning Talk

The Little Schemer in Clojure – Chapter 10 – What is the value of all this? (A Simple Scheme evaluator in Clojure)

This is the tenth chapter of a series of posts about porting The Little Schemer to Clojure. You may wish to read the intro.

So far we’ve covered flat data structuresreading flat data structurescreating data structurescreating a numeric tower, working with multiple occurrences of a match in the list,  changing our functions to work with nested listsbuilding a numerical expression evaluatorperforming analysis on sets and numeric functions and deriving the Y-Combinator.

Now we’ll come to the pinnacle of the book (and some would say the peak of LISP itself). We’re building a metacircular interpreter. “What on earth is that?” you might ask. In trying to think of the most concise way to convey this idea – I came across this image:

This is another one that gets the same idea across. It might seem unimaginably crude, but it conveys the big idea of what this post is all about. LISP is a programmable programming language.

It is also the language with the most minimal syntax. If it has the most minimal syntax – then it must also be the easiest to implement. So one of the benefits of Lisp is that it is the easiest to implement (even it itself). That’s what we’ll do today.

But I hear you asking, “Um, practical use case? How would I explain to my manager that I need to eval my eval?”

This is the building block for a couple of different directions. First – if a language is easy to implement – then it is easy to build it into a transpiler/code generator. Clojure is a DSL for JVM bytecode, just as Clojurescript is a DSL for javascript. (One might argue that C is a DSL for ASM). Work is being done on a Clojure to X10C assembler, as well as work on Clojure to x86 generator.

On the business end of things – there comes a time when we need to build a parser for a client app. This might be for a calculator that reads an equation to build a graph or do a financial calculation. Or it might be one of the other seven scenarios Steve Yegge lists here. The point is – you need to write a program that reads programs (and even evaluates them).

Now please have some patience with what follows. This is a port from Scheme to Clojure. (This is 1974ish Scheme in a point in time with dinosaurs down the road and no built in map data structures in the language). This is trying to build everything the book has covered from the the ground up – so <sigh> we’re going to build a map data structure.

We’ll start with a key-value pair to put in our map.

(def new-entry build)

Then we’ll  build a function to get a value out of the map by key value. But first we’ll write a handler that gets passed in saying what the author wants to happen when the entry is not found.

(def lookup-in-entry-help
  (fn [name names values entry-f]
    (cond
      (empty? names) (entry-f name)
      (= (first names) name) (first values)
      true (lookup-in-entry-help
             name
             (rest names)
             (rest values)
             entry-f))))

We’d use it with a helper function together like this:

(def couldntfind
  (fn [a]
    (println "couldn't find" a)))

(println (lookup-in-entry-help 'entree '(mains dessert) '(chicken icecream) couldntfind))

Now we’ll lookup the value in a key-value entry:

(def lookup-in-entry 
  (fn [name entry entry-f]
    (lookup-in-entry-help
      name
      (first_ entry)
      (second_ entry)
      entry-f)))

Here is an illustration of how you’d use this:

(println (lookup-in-entry 'entree '((entree mains dessert) (garlicbread chicken icecream)) println))

Now we’ll add the ability to extend our table – which is a cons list:

(def extend-table cons)

Now we’ll lookup a value in the map

(def lookup-in-table
  (fn [name table table-f]
    (cond
      (null? table) (table-f name)
      true (lookup-in-entry
             name
             (first table)
             (fn [name]
               (lookup-in-table
                 name
                 (rest table)
                 table-f))))))

We can use it like this:

(println (lookup-in-table 'mains '( ((mains dessert) (steak gelato))) println))
;//=> steak

Ok – we’ve done our data structure. Now we’re going to start handling s-expressions. We’re going to handle six types:
* self-evaluating
* quote
* identifier
* lambda
* cond, and
* application

We’ll build a table to capture our environment:

(def initial-table
  (fn [name]
    (cond
      (= name 't) true
      (= name 'nil) nil
      true (build 'primitive name))))

The basic building blocks of lisp are functions and data. We’ll refer to our basic functions as identifiers, and our basic data atoms as self-evaluating. We’ll create some functions to capture this idea:

For functions we take the expression and go and lookup the handler in our table:

(def *identifier
  (fn [e table]
    (lookup-in-table
      e table initial-table)))

For data atoms – we just leave them as they are – but keep the function convention:

(def *self-evaluating
  (fn [e table]
    e))

For our data atoms – we’ll split these into numbers and strings – and build a handler to do that:

(def atom-to-action
  (fn [e]
    (cond
      (number? e) *self-evaluating
      true *identifier)))

Now we’ll look at the quote function. First we’ll build a function to get the value being quoted

(def text-of-quotation second_)

The only quirk to this definition is that we only take the first atom quoted and ignore the rest. You may consider this a bug in the book! We’ll stick to the book’s definition for now.

Then we’ll build our quote function handler which basically gets the value being quoted:

(def *quote
  (fn [e table]
    (text-of-quotation e)))

Now we start to get into some interesting territory. We’re going to handle lambda. We’ll do this by creating a new entry in the table marked as a non-primitive and storing the body (and the argument) of the lambda function in the entry value:

(def *lambda
  (fn [e table]
    (build 'non-primitive
           (cons table (rest e)))))

Giving this a try- we get:

(println (*lambda '(*lambda (b) (println b)) '()))
;//=>(non-primitive (() (b) (println b)))

We’re now going to look at cond (defined in the scheme way – not the Clojure way).

Now we’ll add some abstractions to get out the truth test and conditional expression to be evaluated from a line in a cond test:

(def question-of first_)

(def answer-of second_)

I’m sure you didn’t really need that – but the authors of the book seemed to think it would aid readability.

Now we’ll write some code to evaluate a line in a cond expression:

(def evcon
  (fn [lines table]
    (cond
      (meaning 
        (question-of (first lines)) table)
      (meaning 
        (answer-of (first lines)) table)
      true (evcon (rest lines) table))))

(We’ll define the function meaning in a moment. Yes there are a few circular references here. You can see on the code running page how they all hang together. )

Now we’ll get the body of a cond expression:

(def cond-lines rest)

Now we’ll write the code to evaluate a condition using all our build block functions:

(def *cond
  (fn [e table]
    (evcon (cond-lines e) table)))

Now we’ll build our s-expression function handler – based on the types we looked at before:

(def list-to-action
  (fn [e]
    (cond
      (atom? (first e)) (cond
                          (= (first e) 'quote) *quote
                          (= (first e) 'lambda) *lambda
                          (= (first e) 'cond) *cond
                          true *application)
      true *application)))

You’ll notice we excluded self-evaluating and identifier as these relate to atoms not functions. What we’ll build now is a function to do that step of filtering:

(def expression-to-action
  (fn [e]
    (cond
      (atom? e) (atom-to-action e)
      true (list-to-action e))))

This next function the book authors have assigned a profound meaning – but really it is just a combination of the function handler and the environment table. This is functions + data, not at an s-expression level, but at a program level. The beautiful thing about a LISP is that the s-expression level conceptually mirrors the program level. Let’s look at the function meaning:

(def meaning
  (fn [e table]
    ((expression-to-action e) e table)))

Now we need a bootstrap function to be able to use our meaning function. In LISPs – this is the eval function. The authors here have chosen to call it value. All it is doing is calling the meaning function, with an empty environment table:

(def value
  (fn [e]
    (meaning e '())))

Now we’ll do a handler for primitive identifiers:

(def apply-primitive
  (fn [name vals]
    (cond
      (= name 'car ) (first (first_ vals))
      (= name 'cdr ) (rest (first_ vals))
      (= name 'cons ) (cons (first_ vals) (second_ vals))
      (= name 'eq ) (= (first_ vals) (second_ vals))
      (= name 'atom? ) (atom? (first_ vals))
      (= name 'not ) (not (first_ vals))
      (= name 'null? ) (null? (first_ vals))
      (= name 'number? ) (number? (first_ vals))
      (= name 'zero? ) (zero? (first_ vals))
      (= name 'add1 ) (add1 (first_ vals))
      (= name 'sub1 ) (sub1 (first_ vals)))))

This maps the scheme primitive functions to corresponding clojure ones.

Now we’ll look at evaluating a list of items in the table:

(def evlis
  (fn [args table]
    (cond
      (null? args) '()
      true (cons (meaning (first args) table)
                 (evlis (rest args) table)))))

Now we’ll look at evaluating lambda functions as closures. We’ll start by getting a lambda expression identifier out of the table:

(def table-of first_)

Next we’ll get the lambda function arguments:

(def formals-of second_)

Next we’ll get the body of the lambda expression:

(def body-of third_)

For a given expression in a lambda body, we’ll extract the function from the s-expression:

(def function-of  first_)

Next we’ll get the body of the s-expression inside the lambda expression:

(def arguments-of rest)

Now we’ll test to see if an expression in the table has been tagged as a primitive:

(def primitive?
  (fn [l]
    (= (first_ l) 'primitive)))

Now we’ll see the opposite – if an expression in the table has been tagged as a non-primitive:

(def non-primitive?
  (fn [l]
    (= (first_ l) 'non-primitive)))

Now we’ll evaluate our lambda expression against the arguments passed in. Note that we can nest lambda expressions to make a function return a function:

(def apply-closure
  (fn [closure vals]
    (meaning (body-of closure)
             (extend-table
               (new-entry
                 (formals-of closure) vals)
               (table-of closure)))))

Now we’ll apply values to both primitive and non-primitive expressions:

(def apply_
  (fn [fun vals]
    (cond
      (primitive? fun) (apply-primitive (second_ fun) vals)
      (non-primitive? fun) (apply-closure (second_ fun) vals))))

Now we’ll step up a level and apply the results of one expression to return a function – to the results of another function to get the arguments to the first function:

(def *application
  (fn [e table]
    (apply_
      (meaning (function-of e) table)
      (evlis (arguments-of e) table))))

To approach self-hosting (we won’t get there – but it is the principle of the thing) – we’ll build our own cons function:

(def cons_
  (fn [u v]
    (fn [b]
      (cond
        b u
        true v))))

Note that instead of returning a list – we’re returning a function:

(println (cons_ 'apple '()))
;//=> #<Chapter10WhatIsTheValueOfAllThis$cons_$fn__862 Chapter10WhatIsTheValueOfAllThis$cons_$fn__862@272b72f4>

(def lunch (cons_ 'apple '()))

(println (lunch 'apple ))
;//=> 'apple
(println (lunch '1 ))
;//=> apple
(println (lunch nil ))
;//=> ()

Now we’ll define car and cdr to operate on this concept of function-chaining to represent lists:

(def car_
  (fn [l]
    (l true)))

(def cdr_
  (fn [l]
    (l nil)))

As an aside – this reliance on function chaining is almost the opposite of what we want. Instead we want to be able to represent chained function calls as data structures – to make the overhead of calls in recursion much cheaper. This representation makes it more expensive. Consider it a toy to stretch your brain and how you think about functions.

We’ve got to the end now – and are about to do our closing demos – and yet the book leaves out a critical piece which is disappointing. We want to be able to take all the functions in the book, and run them in our new evaluator – to fully close the conceptual loop. The authors however leave out define. We’re left to define everything with anonymous functions. We could do it ourselves, but that’s beyond the scope of this blog post.

We’ll finish by taking our evaluator for a spin.

(value '(add1 1))
 ;//=> 2
 
(value '(eq 2 1))
;//=> false

(value '(eq 1 1))
;//=> true
 
(value '(quote hello))
;//=> hello

(value '((lambda (x) 1) 2))
;//=> 1

(value '((lambda (x) x) 2))
;//=> 2

(value '((lambda (x) (add1 x)) 2))
;//=> 3

(value '(((lambda (y) (lambda (x) 1) y) 4)  3))
;//=> 1

(value '(((lambda (y) (lambda (x) x) y) 4)  3))
;//=> 3

(value '(((lambda (x y) (lambda (u) (cond (u x) (t y)))) 1 '()) nil))
;//=> ()

(value '((lambda (x) ((lambda (x) (add1 x)) (add1 4))) 6))
;//=> 6

You can see an example of this running here.

The Little Schemer in Clojure – Chapter 9 – Lambda The Ultimate (Deriving the Y-Combinator)

This is the ninth chapter of a series of posts about porting The Little Schemer to Clojure. You may wish to read the intro.

So far we’ve covered flat data structuresreading flat data structurescreating data structurescreating a numeric tower, working with multiple occurrences of a match in the list,  changing our functions to work with nested listsbuilding a numerical expression evaluator and performing analysis on sets and numeric functions.

This chapter is one of the high points of the whole book. You may have heard of Paul Graham’s VC fund, YCombinator – in this session we’ll not only explain the programming concept of the YCombinator – we’ll derive it in Clojure.

Ok – so before we get started – what is it? What is this YCombinator concept everyone is so excited about? Here goes. The YCombinator is a way to do recursion in a language that doesn’t support recursion. Instead recursion is defined as a set of rewrite rules.

“What?” I hear you saying. “All the languages I use have recursion, is this just some old hack that people used in the 60’s before they had real languages? Why should I get excited about that?” Fair enough. On it’s own its utility is limited, but it becomes a conceptual building block for lazy functional data structures – which are one of the things that people get excited about in Clojure – and something we don’t see in curly brace languages.

So anyway – let’s get started.

We’ll start with the rember-f function. This takes a list of items, an addition atom and a comparison function. The point is to demonstrate passing functions around.

;demonstrating passing functions around
(def rember-f
  (fn [test? a l]
    (cond
      (null? l) '()
      true (cond
             (test? (first l) a) (rest l)
             true (cons (first l) (rember-f test? a (rest l)))))))

(println (rember-f = '(pop corn) '(lemonade (pop corn) and (cake))))
;//=>(lemonade and (cake))

So we passed in the equals function (=) and it was used to test equality with the other item passed in.

Now we’ll rewrite the rember-f function to remove the second conditional:

(def rember-f
  (fn [test? a l]
    (cond
      (null? l) '()
      (test? (first l) a) (rest l)
      true (cons (first l) (rember-f test? a (rest l))))))

(println (rember-f = '(pop corn) '(lemonade (pop corn) and (cake))))
;//=>(lemonade and (cake))

Now we’re going to start looking at writing functions to be curried. By currying we’re referring to the Mathematician Haskell Curry – for whom this is named. The idea of currying a function is that you can pass in fewer than all the functions required for the function to execute, and still pass the result around as a function (now with fewer arguments).

(def eq?-c
  (fn [a]
    (fn [x]
      (= x a))))

(println (eq?-c 'lemonade))
=> (println (eq?-c 'lemonade))
#<Chapter9LambdaTheUltimate$eq_QMARK__c$fn__974 Chapter9LambdaTheUltimate$eq_QMARK__c$fn__974@2a2a2ae9>
(println ((eq?-c 'lemonade) 'coke))
;//=> false
(println ((eq?-c 'lemonade) 'lemonade))
;//=> true

(def eq?-salad (eq?-c 'salad))

(println (eq?-salad 'tuna))
;//=>false
(println (eq?-salad 'salad))
;//=>true

Ok – taking that principle to the next level – can we curry our rember-f function? Here goes:

(def rember-f
  (fn [test?]
    (fn [a l]
      (cond
        (null? l) '()
        (test? (first l) a) (rest l)
        true (cons (first l) ((rember-f test?) a (rest l)))))))

(println ((rember-f =) 'tuna '(tuna salad is good)))
;//=>(salad is good)

(def rember-eq? (rember-f =))
(println (rember-eq? 'tuna '(tuna salad is good)))
;//=>(salad is good)

That was so much fun, we’ll do it again with something similar, the insertL-f function.

(def insertL-f
  (fn [test?]
    (fn [new old l]
      (cond
        (null? l) '()
        (test? (first l) old) (cons new (cons old (rest l)))
        true (cons (first l) ((insertL-f test?) new old (rest l)))))))

(println ((insertL-f =) 'creamy 'latte '(a hot cup of latte)))

One more time! Let’s look at insertR-f:

(def insertR-f
  (fn [test?]
    (fn [new old l]
      (cond
        (null? l) '()
        (test? (first l) old) (cons old (cons new (rest l)))
        true (cons (first l) ((insertR-f test?) new old (rest l)))))))

(println ((insertR-f =) 'cake 'cheese '(new york cheese)))

Ok – so you’ve probably noticed that insertR and insertL are quite similar except for a piece of code in the middle. Now we’re going to try and replace these two functions with a single function insert-g, that gets another function argument passed in for the difference. We’ll call the functions we insert seqL and seqR. Here goes:

(def seqL
  (fn [new old l]
    (cons new (cons old l))))

(def seqR
  (fn [new old l]
    (cons old (cons new l))))

(def insert-g
  (fn [seqarg]
    (fn [new old l]
      (cond
        (null? l) '()
        (= (first l) old) (seqarg new old (rest l))
        true (cons (first l) ((insert-g seqarg) new old (rest l)))))))

(def insertL (insert-g seqL))
(println (insertL 'creamy 'latte '(a hot cup of latte)))
;//=>(a hot cup of creamy latte)
(def insertR (insert-g seqR))
(println (insertR 'cake 'cheese '(new york cheese)))
;//=>(new york cheese cake)

Ok – what if we do that again – but don’t pass in seqL as a named function – just implement it in the definition of insertL?

(def insertL
  (insert-g
    (fn [new old l]
      (cons new (cons old l)))))
(println (insertL 'creamy 'latte '(a hot cup of latte)))
;//=>(a hot cup of creamy latte)

You can argue that this is better as we don’t have to remember as many function names.

Now we’ll look at the subst function from Chapter 3.

(def subst
  (fn [new old l]
    (cond
      (null? l) '()
      (= (first l) old) (cons new (rest l))
      true (cons (first l) (subst new old (rest l))))))

(println (subst 'espresso 'latte '(a hot cup of latte)))
;//=>(a hot cup of espresso)

So what we notice is that subst is quite close to insertL and insertR. So now we can write a new seq function for insert-g. Then we can plug it into a new definition of subst:

(def seqS
  (fn [new old l]
    (cons new l)))

(def subst (insert-g seqS))

(println (subst 'espresso 'latte '(a hot cup of latte)))
;//>(a hot cup of espresso)

Cool – now we can use this technique on the rember function:

(def seqrem
  (fn [new old l]
    l))

(def rember
  (fn [a l]
    ((insert-g seqrem) nil a l)))

(println (rember 'hot '(a hot cup of espresso)))
;//=>(a cup of espresso)

It’s time for another Commandment. The Tenth Commandment states that you should “Abstract functions with common structures into a single function“. This is kind of like extract method – except that you pass the method in as a closure.

Now we’ll go back to the value function from Chapter 7.

(def number_?
  (fn [a]
    (cond
      (null? a) false
      (number? a) true
      true false)))

(def first-sub-exp
  (fn [aexp]
    (first (rest aexp))))

(def second-sub-exp
  (fn [aexp]
    (first (rest (rest aexp)))))

(def operator
  (fn [aexp]
    (first aexp)))

(use 'clojure.math.numeric-tower)

(def value
  (fn [aexp]
    (cond
      (number_? aexp) aexp
      (= (operator aexp) '+) (+ (value (first-sub-exp aexp)) (value  (second-sub-exp aexp)))
      (= (operator aexp) '*) (* (value (first-sub-exp aexp)) (value (second-sub-exp aexp)))
      (= (operator aexp) 'exp) (expt (value (first-sub-exp aexp)) (value (second-sub-exp aexp))))))

(println (value '(+ 1 1)))
;//=>2

Now we’ll simplify this down by using a function to take a symbol and return a function:

(def atom-to-function
  (fn [x]
    (cond
      (= x '+ ) +
      (= x '* ) *
      (= x 'exp ) expt )))

(def value
  (fn [aexp]
    (cond
      (number_? aexp) aexp
      true ((atom-to-function (operator aexp))
             (value (first-sub-exp aexp))
             (value (second-sub-exp aexp))))))

(println (value '(+ 1 1)))
;//=> 2

So that was a bit shorter.

Now we’ll look at subset? and intersect? from Chapter 8.

(def member?
  (fn [a lat]
    (cond
      (null? lat) false
      true (or
        (= (first lat) a)
        (member? a (rest lat)))) ))

(def subset?
  (fn [set1 set2]
    (cond
      (null? set1) true
      true (and
             (member? (first set1) set2)
             (subset? (rest set1) set2)))))
(println (subset? '(a b c) '(b c d)))
;//=>false
(println (subset? '(b c) '(b c d)))
;//=>true

(def intersect?
  (fn [set1 set2]
    (cond
      (null? set1) false
      true (or
             (member? (first set1) set2)
             (intersect? (rest set1) set2)))))

(println (intersect? '(a b c) '(b c d)))
;//=>true

What we notice is that these two functions only differ in their use of and true and or nil. We can try and abstract them as we’ve done before.

(def set-f?
  (fn [logical? const]
    (fn [set1 set2]
      (cond
        (null? set1) const
        true (logical?
               (member? (first set1) set2)
               ((set-f? logical? const) (rest set1) set2))))))

;(def subset? (set-f? and true))
;(def intersect? (set-f? or nil))
; note - doesn't work yet

But this doesn’t work. (This is where a light bulb comes on and we start learning). We need to redefine the and and or functions for the functions we’re using:

(def and-prime
  (fn [x y]
    (and x y)))

(def or-prime
  (fn [x y]
    (or x y)))
; still doesn't work

(def or-prime
  (fn [x set1 set2]
    (or x (intersect? (rest set1) set2))))

(def and-prime
  (fn [x set1 set2]
    (and x (subset? (rest set1) set2))))

(def set-f?
  (fn [logical? const]
    (fn [set1 set2]
      (cond
        (null? set1) const
        true (logical?
               (member? (first set1) set2)
               set1 set2)))))

(def intersect? (set-f? or-prime false))
(def subset? (set-f? and-prime true))

(println (intersect?  '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(banana butter) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true

The tricky thing there was the recursive use and definition of intersect? ie assuming we could use a function before we had defined it in one function, and then using that function to define insersect? – somewhat brain bending!

You can see we wrote set-f to accept and-prime and or-prime as functions passed in as arguments.

Ok – so we’re all warmed up with passing functions around as arguments. Let’s get on with this Y Combinator derivation. We’ll start with multirember from Chapter 5. We’ll simplify it to remove the redundant cond:

(def multirember
  (fn [a lat]
    (cond
      (null? lat) '()
      (= (first lat) a) (multirember a (rest lat))
      true (cons (first lat) (multirember a (rest lat))))))

(println (multirember 'breakfast '(breakfast toasted banana bread with butter for breakfast)))
;//=>(toasted banana bread with butter for)

Too easy. Now we’ll curry this function to return a function that removes a particular list member:

(def mrember-curry
  (fn [l]
    (multirember 'curry l)))

(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

Again we’ve done this before – not too hard. Now we’ll rewrite mrember-curry in full without currying it:

(def mrember-curry
  (fn [l]
    (cond
      (null? l) '()
      (= (first l) 'curry) (mrember-curry (rest l))
      true (cons (first l) (mrember-curry (rest l))))))

(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

Now let’s curry the function above again with an argument that the curried function can be applied to:

(def curry-maker
  (fn [future]
    (fn [l]
      (cond
        (null? l) '()
        (= (first l) 'curry) ((curry-maker future) (rest l))
        true (cons (first l) ((curry-maker future) (rest l)))))))

(def mrember-curry (curry-maker 0))
;//=>(chicken with rice)

Ok so why did we bother with that one? We’ll its like how we replaced insertL with insert-g – except we applied it to a function that already returns functions.

Let’s look at how we use it. We can use curry-maker to define mrember-curry with curry-maker.

(def mrember-curry
  (curry-maker curry-maker))

(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

Ok – that wasn’t a big deal – we replaced a zero value with another function. Let’s apply it further. We’ll use this zero replacement in curry-maker to write a function function-maker:

(def function-maker
  (fn [future]
    (fn [l]
      (cond
        (null? l) '()
        (= (first l) 'curry) ((future future) (rest l))
        true (cons (first l) ((future future) (rest l)))))))

;for yielding mrember-curry when applied to a fcuntion

;
(def mrember-curry
  (function-maker function-maker))
(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

Ok – we’re about half way there. Now for any internal expression inside a function, we can wrap it in an applied lamdba (fn in clojure) and still have it return the same result. We’ll do that for our function-maker function.

(def function-maker
  (fn [future]
    (fn [l]
      (cond
        (null? l) '()
        (= (first l) 'curry) ((fn [arg] ((future future) arg)) (rest l))
        true (cons (first l) ((fn [arg] ((future future) arg)) (rest l)))))))

(def mrember-curry
  (function-maker function-maker))
(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

Ok – that all still works ok.

Now – our function-maker is in a way double-curried. What if we curry it again?

(def function-maker
  (fn [future]
    ((fn [recfun]
      (fn [l]
        (cond
          (null? l) '()
          (= (first l) 'curry) (recfun (rest l))
        true (cons (first l) ((future future))))))
    (fn [arg] ((future future) arg)))))
;abstraction above to remove l
; just take my word on this for now

Now we’ll split our triple-curried function into two functions:

(def M
  (fn [recfun]
    (fn [l]
      (cond
        (null? l) '()
        (= (first l) 'curry) (recfun (rest l))
        true (cons (first l) (recfun (rest l)))))))

(def function-maker
  (fn [future]
    (M (fn [arg]
         ((future future) arg)))))

Now we’ll refactor mrember-curry without an explicit reference to function-maker:

;Now we'll change this
(def mrember-curry
  (function-maker function-maker))
;to this
(def mrember-curry
  ((fn [future]
     (M (fn [arg]
          ((future future) arg))))
    (fn [future]
      (M (fn [arg]
           ((future future) arg))))))

This is where the book recommends you take a rest. We’ll push on.

Now we’ll refactor this definition by allowing you to pass in M as a function:

(def Y
  (fn [M]
    ((fn [future]
       (M (fn [arg]
            ((future future) arg))))
      (fn [future]
        (M (fn [arg]
             ((future future) arg)))))))

(def mrember-curry (Y M))

(println (mrember-curry '(curry chicken with curry rice)))
;//=>(chicken with rice)

That wasn’t too bad. We just used the y-combinator on the mrember function. Now we’ll do it for length:

;using add1 from chapter 7 not chapter 4
(def add1
  (fn [n]
    (cons '() n)))

; now we'll look at using the y-combinator to look at the length of a list
(def L
  (fn [recfun]
    (fn [l]
      (cond
        (null? l) '()
        true (add1 (recfun (rest l)))))))

(def length (Y L))

(println (length '(curry chicken with curry rice)))
;//=>(() () () () ()) ie 5

Now we’ll do it again, but won’t pass in a definition for L – but just use the definition inline:

(def add1
  (fn [n]
    (+ 1 n)))

;just for the sake of it - we'll rewrite length without the L function
(def length
  (Y
    (fn [recfun]
      (fn [l]
        (cond
          (null? l) 0
          true (add1 (recfun (rest l))))))))

(println (length '(curry chicken with curry rice)))
;//=>5

Now we’ll rewrite length without Y or L:

(def length
  ((fn [M]
     ((fn [future]
        (M (fn [arg]
             ((future future) arg))))
     (fn [future]
       (M (fn [arg]
            ((future future) arg))))))
    (fn [recfun]
      (fn [l]
        (cond
          (null? l) 0
          true (add1 (recfun (rest l))))))))

(println (length '(curry chicken with curry rice)))
;//=>5

Ok that ends the Chapter – but we want to take this thing to the next level. Let’s use the ycombinator to start defining a lazy function:

;building a pair with an S-expression and a thunk leads to a stream
(def first$ first)

(def second$
  (fn [str]
    ((second str))))

; careful re use of first and second here - as yet undefined!

(def build
  (fn [a b]
    (cond
      true (cons a (cons b '())))))

(def str-maker
  (fn [next n]
    (build n (fn [] (str-maker next (next n))))))

(def int_ (str-maker add1 0))

(def even (str-maker (fn [n] (+ 2 n)) 0))

;sub1 from chapter 4
(def sub1
  (fn [n]
    (- n 1)))

(def frontier
  (fn [str n]
    (cond
      (zero? n) '()
      true (cons (first$ str) (frontier (second$ str) (sub1 n))))))

(frontier int_ 10)
;//=>(0 1 2 3 4 5 6 7 8 9)

So that got us the creation of a lazy data structure for a basic example. Now let’s try a more interesting one:

(def Q
  (fn [str n]
    (cond
      (zero? (rem (first$ str) n)) (Q (second$ str) n)
      true (build (first$ str) (fn [] (Q (second$ str) n))))))
; note new function call rem - re new primitve

(def P
  (fn [str]
    (build (first$ str) (fn [] (P (Q str (first$ str)))))))

(frontier (P (second$ (second$ int_))) 10)
;//=>(2 3 5 7 11 13 17 19 23 29)

What an interesting stream of numbers!

That’s enough for today.

You can see it working here.

References

Here are some additional references on deriving the Y-Combinator

The Little Schemer in Clojure – Chapter 8 – Friends and Relations

This is the eighth Chapter of a series of posts about porting The Little Schemer to Clojure. You may wish to read the intro.

So far we’ve covered flat data structuresreading flat data structurescreating data structurescreating a numeric tower, working with multiple occurrences of a match in the list,  changing our functions to work with nested lists and building a numerical expression evaluator.

This chapter we’ll look at sets, relations and functions (in the mathematical sense).  So starting with set? let’s see if a list contains a set of unique items.

(def set_?
  (fn [lat]
    (cond
      (null? lat) true
      true (cond
             (member? (first lat) (rest lat)) false
             true (set_? (rest lat))))))

(println (set_? '(toasted banana bread with butter for breakfast)))
;//=>true
(println (set_? '(breakfast toasted banana bread with butter for breakfast)))
;//=>false

Note that this assumes the implementation of member? from Chapter 5.

Now let’s simplify our definition of set?:

(def member
  (fn [lat]
    (cond
      (null? lat) true
      (member? (first lat) (rest lat)) false
      true (set? (rest lat)))))

(println (set_? '(toasted banana bread with butter for breakfast)))
;//=>true
(println (set_? '(breakfast toasted banana bread with butter for breakfast)))
;//=>false

Now given a non-unique list of items, let’s return a set of unique items based on the original list with makeset:

(def makeset
  (fn [lat]
    (cond
      (null? lat) '()
      (member? (first lat) (rest lat)) (makeset (rest lat))
      true (cons (first lat) (makeset (rest lat))))))

(println (makeset '(breakfast toasted banana bread with butter for breakfast)))
;//=> (toasted banana bread with butter for breakfast)
(println (set_? (makeset '(breakfast toasted banana bread with butter for breakfast))))
;//=>true

Now we’ll refactor makeset using our definition of multirember from Chapter 5:

(def makeset
  (fn [lat]
    (cond
      (null? lat) '()
      true (cons (first lat) (makeset (multirember (first lat) (rest lat)))))))

(println "")
(println "makeset - refactored with multirember")
(println (makeset '(breakfast toasted banana bread with butter for breakfast)))
;//=> (breakfast toasted banana bread with butter for)
; note other way around
(println (set_? (makeset '(breakfast toasted banana bread with butter for breakfast))))
;//=>true

While we’re looking at sets, let’s write a function to work out if one set is a subset of another:

(def subset?
  (fn [set1 set2]
    (cond
      (null? set1) true
      true (cond
             (member? (first set1) set2) (subset? (rest set1) set2)
             true false))))

(println (subset? '(banana butter) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(banana butter) '(toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(peanut butter) '(toasted banana bread with butter for breakfast)))
;//=>false

Now we’ll refactor subset to remove the second conditional:

(def subset?
  (fn [set1 set2]
    (cond
      (null? set1) true
      (member? (first set1) set2) (subset? (rest set1) set2)
      true false)))

(println (subset? '(banana butter) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(banana butter) '(toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(peanut butter) '(toasted banana bread with butter for breakfast)))
;//=>false

Now we’ll refactor subset again to make better use of the trailing final condition:

(def subset?
  (fn [set1 set2]
    (cond
      (null? set1) true
      true (and
             (member? (first set1) set2)
             (subset? (rest set1) set2)))))

(println (subset? '(banana butter) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(banana butter) '(toasted banana bread with butter for breakfast)))
;//=>true
(println (subset? '(peanut butter) '(toasted banana bread with butter for breakfast)))
;//=>false

Now we’ll write eqset to see if two sets are equal (regardless of order):

(def eqset?
  (fn [set1 set2]
    (cond
      (subset? set1 set2) (subset? set2 set1)
      true false)))

(println (eqset? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>false
(println (eqset? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (subset? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>false

Now we’ll refactor eqset to have only one condition line:

(def eqset?
  (fn [set1 set2]
    (cond
      true (and
             (subset? set1 set2)
             (subset? set2 set1)))))

(println (eqset? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>false
(println (eqset? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (subset? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>false

Now we’ll refactor eqset to remove the condition line altogether:

(def eqset?
  (fn [set1 set2]
      (and
        (subset? set1 set2)
        (subset? set2 set1))))

(println (eqset? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>false
(println (eqset? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (subset? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>false

Now we’ll write a test to see if two sets intersect?:

(def intersect?
  (fn [set1 set2]
    (cond
      (null? set1) false
      true (cond
             (member? (first set1) set2) true
             true (intersect? (rest set1) set2)))))

(println (intersect? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (intersect? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (intersect? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>true
(println (intersect? '(strawberry yoghurt) '(toasted banana bread )))
;//=>false

Now we’ll refactor intersect? to remove the redundant second conditional:

(def intersect?
  (fn [set1 set2]
    (cond
      (null? set1) false
      (member? (first set1) set2) true
      true (intersect? (rest set1) set2))))

(println (intersect? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (intersect? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (intersect? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>true
(println (intersect? '(strawberry yoghurt) '(toasted banana bread )))
;//=>false

Now we’ll refactor intersect? to make better use of the trailing conditional:

(def intersect?
  (fn [set1 set2]
    (cond
      (null? set1) false
      true (or
             (member? (first set1) set2)
             (intersect? (rest set1) set2)))))

(println (intersect? '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>true
(println (intersect? '(toasted banana bread) '(toasted banana bread)))
;//=>true
(println (intersect? '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>true
(println (intersect? '(strawberry yoghurt) '(toasted banana bread )))
;//=>false

Now we’ll actually get the values at the intersection:

(def intersect
  (fn [set1 set2]
    (cond
      (null? set1) '()
      (member? (first set1) set2) (cons (first set1) (intersect (rest set1) set2))
      true (intersect (rest set1) set2))))

(println (intersect '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>(toasted banana bread)
(println (intersect '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>(toasted)
(println (intersect '(strawberry yoghurt) '(toasted banana bread )))
;//=>()

Now we’ll refactor intersect to filter out non-members first:

(def intersect
  (fn [set1 set2]
    (cond
      (null? set1) '()
      (not (member? (first set1) set2)) (intersect (rest set1) set2)
      true (cons (first set1) (intersect (rest set1) set2)))))

(println (intersect '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>(toasted banana bread)
(println (intersect '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>(toasted)
(println (intersect '(strawberry yoghurt) '(toasted banana bread )))
;//=>()

The opposite of getting the intersection of two sets is to get the union – we’ll do that now:

(def union
  (fn [set1 set2]
    (cond
      (null? set1) set2
      (member? (first set1) set2) (union (rest set1) set2)
      true (cons (first set1) (union (rest set1) set2)))))

(println (union '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>(breakfast toasted banana bread with butter for breakfast)
;// note not a set since not given a set
(println (union '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>(peanut butter for breakfast toasted banana bread)
(println (union '(strawberry yoghurt) '(toasted banana bread )))
;//=>(strawberry yoghurt toasted banana bread)

The function next in the line is the complement of two sets:

(def complement_
  (fn [set1 set2]
    (cond
      (null? set1) '()
      (member? (first set1) set2) (complement_ (rest set1) set2)
      true (cons (first set1) (complement_ (rest set1) set2)))))

(println (complement_ '(toasted banana bread) '(breakfast toasted banana bread with butter for breakfast)))
;//=>()
(println (complement_ '(toasted peanut butter for breakfast) '(toasted banana bread )))
;//=>(peanut butter for breakfast)
(println (complement_ '(strawberry yoghurt) '(toasted banana bread )))
;//=>(strawberry yoghurt)

Now we’ll look at the intersection of more than two sets with intersect-all:

(def intersect-all
  (fn [l-set]
    (cond
      (null? (rest l-set)) (first l-set)
      true (intersect (first l-set) (intersect-all (rest l-set))))))

(println "")
(println "intersect-all")
(println (intersect-all
           '(
              (toasted banana bread)
              (breakfast toasted banana bread with butter for breakfast)
              (toasted peanut butter for breakfast)
              (toasted banana bread ))))
;//=>(toasted)

Here the book changes direction – our work on sets has been a building block for looking at functions and relations. Assuming that we can express a function as a set of value pairs – we need to be able to have the building blocks to work with pairs. We’ll start by getting the first and second value out of a pair:

(def first_
  (fn [p]
    (cond
      true (first p))))

(println (first_ '(a b)))
;//=>a

(def second_
  (fn [p]
    (cond
      true (first (rest p)))))

(println (second_ '(a b)))
;//=>b

Next we’ll add the ability to build a pair:

(def build
  (fn [a b]
    (cond
      true (cons a (cons b '())))))

(println (build 'a 'b))
;//=>(a b)

For the sake of keeping our brain going – we’ll get the third item of a pair just to keep you wondering:

(def third_
  (fn [p]
    (cond
      true (first (rest (rest p))))))

(println (third_ '(a b c)))
;//=>c

Now we’ll look closer at functions. You’ll remember that for a set of pairs to be a function, there must be only one domain value – ie the first values from a set of pairs must themselves be a set. We’ll borrow our definition of member* from Chapter 6.

(def firsts
  (fn [l]
    (cond
      (empty? l) '()
      true (cons (first (first l))
        (firsts (rest l))))))

(println "")
(println "firsts")
(println (firsts '((8 3)(4 2)(7 6)(6 2)(3 4))))
;//=>(8 4 7 6 3)

(def fun?
  (fn [rel]
    (set? (firsts rel))))

(println "")
(println "fun? - refactored to use set? and firsts")
(println (fun? '((4 3)(4 2)(7 6)(6 2)(3 4))))
;//=>false
(println (fun? '((8 3)(4 2)(7 6)(6 2)(3 4))))
;//=>true
(println (fun? '((8 3)(4 2)(7 1)(6 0)(9 5))))
;//=>true

Now just in case you ever wanted to reverse the elements in their pairs and order in the list – we bring you revrel. (It is actually a building block for other functions to come).

(def revrel
  (fn [rel]
    (cond
      (null? rel) '()
      true (cons
             (build
               (second_ (first rel))
               (first_ (first rel)))
             (revrel (rest rel))))))

(println (revrel '((4 3)(4 2)(7 6)(6 2)(3 4))))
;//=>((3 4) (2 4) (6 7) (2 6) (4 3))
(println (revrel '((8 3)(4 2)(7 6)(6 2)(3 4))))
;//=>((3 8) (2 4) (6 7) (2 6) (4 3))
(println (revrel '((8 3)(4 2)(7 1)(6 0)(9 5))))
;//=>((3 8) (2 4) (1 7) (0 6) (5 9))

So applying what we’ve just learned – a full function is one in which both the values of the domain are unique, and the values of the range are unique. We can test this with fullfun?:

(def fullfun?
  (fn [fun]
    (set_? (seconds_ fun))))

(println (fullfun? '((4 3)(4 2)(7 6)(6 2)(3 4))))
;//=>false
(println (fullfun? '((8 3)(4 2)(7 6)(6 2)(3 4))))
;//=>false
(println (fullfun? '((8 3)(4 2)(7 1)(6 0)(9 5))))
;//=>true

A simpler way to express this idea is that the domain of the function is unique – even if the range values are swapped with the domain values – we call this one-to-one:

(def one-to-one?
  (fn [fun]
    (fun? (revrel fun))))

(println (one-to-one? '((4 3)(4 2)(7 6)(6 2)(3 4))))
;//=>false
(println (one-to-one? '((8 3)(4 2)(7 6)(6 2)(3 4))))
;//=>false
(println (one-to-one? '((8 3)(4 2)(7 1)(6 0)(9 5))))

You can see this running here.

Conclusion:
This chapter we’ve looked at sets, relations and functions (in the mathematical sense). We haven’t really added any new primitives this chapter (thank goodness!).

In total our primitives so far are: atom?, null?, first, rest, cond, fn, def, empty?,=, cons, add1,, sub1, one? and expt. These are all the functions (and those in the chapters to come) that we’ll need to implement to get our metacircular interpreter working.

The Little Schemer in Clojure – Chapter 7 – Shadows

This is the seventh Chapter of a series of posts about porting The Little Schemer to Clojure. You may wish to read the intro.

So far we’ve covered flat data structuresreading flat data structurescreating data structurescreating a numeric tower, working with multiple occurrences of a match in the list, and changing our functions to work with nested lists.

In this chapter we’re driving towards our metacircular interpreter by starting to look at evaluating numerical values. So we’ll look at building an expression, breaking it down into operators and operands, and then handling it and returning a result. So let’s get started!

First we’re going to test with a test function to see if the S-expression passed in is a valid numeric expression that we can evaluate. We’re going to call this function numbered. But first we’ll expand Clojure’s definition of number? – so it can return a null for an invalid expression as we expect

;we need to define number_? to handle true/false
(def number_?
  (fn [a]
    (cond
      (null? a) false
      (number? a) true
      true false)))

(println (number_? 3))
;//=> true
(println (number_? 'a))
;//=> false

So now let’s do numbered to test for S-expressions that we can evaluate as a numeric expression.

(def numbered?
  (fn [aexp]
    (cond
      (atom? aexp) (number_? aexp)
      (= (first (rest aexp)) '+)
         (and
           (numbered? (first aexp))
           (numbered? (first (rest (rest aexp)))))
      (= (first (rest aexp)) '*)
         (and
           (numbered? (first aexp))
           (numbered? (first (rest (rest aexp)))))
      (= (first (rest aexp)) 'exp)
         (and
           (numbered? (first aexp))
           (numbered? (first (rest (rest aexp)))))
      true false)))

(println (numbered? '(a b c)))
;//=> false
(println (numbered? '(3 + (4 * 5))))
;//=> true
(println (numbered? '(3 a (4 * 5))))
;//=> false

Ok – so can we simplify numbered? Yes:

(def numbered?
  (fn [aexp]
    (cond
      (atom? aexp) (number_? aexp)
      true (and
             (numbered? (first aexp))
             (numbered? (first (rest (rest aexp))))))))

(println (numbered? '(a b c)))
;//=> false
(println (numbered? '(3 + (4 * 5))))
;//=> true
(println (numbered? '(3 a (4 * 5))))
;//=> true

Hmm – notice that the final expression became true in this version, but was false in the previous? That’s an assumption we’re making with this new version – that we know already this is an arithmetic expression. One might think this defeats the purpose of this function. I digress.

Now we’ll look at the Eight Commandment. This states that you recur on all subparts that are of the same nature including:
– On all sublists of a list
– On all the subexpressions of a representation of an arithmetic expression.
The big idea is that we’re starting to make a distinction between code and data. We’re saying that data functions recur on data, and code functions recur on code.

Now you’ll remember the infamous function eval from many languages – especially LISP. We’ll start writing the first version of eval for our metacircular interpreter starting with a basic version that just looks at arithmetic expressions. We’ll call it value:

(use 'clojure.math.numeric-tower)
; note use of new primitive expt

(def value
  (fn [aexp]
    (cond
      (number_? aexp) aexp
      (= (first (rest aexp)) '+) (+ (value (first aexp)) (value (first (rest (rest aexp)))))
      (= (first (rest aexp)) '*) (* (value (first aexp)) (value (first (rest (rest aexp)))))
      (= (first (rest aexp)) 'exp) (expt (value (first aexp)) (value (first (rest (rest aexp))))))))

(println (value '(1 + 1)))
;//=> 2
(println (value '(2 + 2)))
;//=> 4
(println (value '(3 exp 3)))
;//=> 27
(println (value '(4 b 4)))
;//=> nil

Now we’re going to take the function above and move from prefix to infix – just to illustrate how trivial the distinction is from an implementation point of view

(def value
  (fn [aexp]
    (cond
      (number_? aexp) aexp
      (= (first  aexp) '+) (+ (value (rest aexp)) (value  (rest (rest aexp))))
      (= (first (rest aexp)) '*) (* (value (first aexp)) (value (first (rest (rest aexp)))))
      (= (first (rest aexp)) 'exp) (expt (value (first aexp)) (value (first (rest (rest aexp))))))))

Now in my view that was a little silly. The book included a non-working version of value just to illustrate a violation of the Eighth Commandment. Let’s get on with it.

Now we’ll extract out helper functions to get the subexpressions and the operator

(def first-sub-exp
  (fn [aexp]
    (first (rest aexp))))

(def second-sub-exp
  (fn [aexp]
    (first (rest (rest aexp)))))

(def operator
  (fn [aexp]
    (first aexp)))

(def value
  (fn [aexp]
    (cond
      (number_? aexp) aexp
      (= (operator aexp) '+) (+ (value (first-sub-exp aexp)) (value  (second-sub-exp aexp)))
      (= (operator aexp) '*) (* (value (first-sub-exp aexp)) (value (second-sub-exp aexp)))
      (= (operator aexp) 'exp) (expt (value (first-sub-exp aexp)) (value (second-sub-exp aexp))))))

(println (value '(+ 1 1)))
;//=> 2
(println (value '(* 2 2)))
;//=> 4
(println (value '(exp 3 3)))
;//=> 27
(println (value '(b 4 4)))
;//=> nil

Cool – now – just to illustrate the power of abstraction by functions – we’ll switch back from prefix to infix functions.

(def first-sub-exp
  (fn [aexp]
    (first  aexp)))

(def operator
  (fn [aexp]
    (first (rest aexp))))

(println (value '(1 + 1)))
;//=> 2
(println (value '(2 * 2)))
;//=> 4
(println (value '(3 exp 3)))
;//=> 27
(println (value '(4 b 4)))

Time for another Commandment – The Ninth Commandment States Use helper functions to abstract from representations. Part of this is the java concept of eclipse’s extract method. Part of it is the power this gives you to change the function across the whole application and have it impact many parts of the system – ie a wide-ranging refactor.

Now we’re testing for null?. Although we had to write our own helper method for this in Chapter 1 – now as part of implementing the metacircular interpreter – we’re re-implementing it in terms of other functions so keep our api minimal.

;This is what we had
(def null?
  (fn [a]
    (or
      (nil? a)
      (= () a))))

;This is what they're proposing
(def null?
  (fn [s]
    (and
      (atom? s)
      (= s '()))))

Now it turns out this fails the conversion from Scheme to Clojure. We get the big idea. We’ll just move on.

Now we’re going to build our own numeric system out of parenthesis – again, just to make the implementation of our metacircular interpreter more pure and easier to implement. This is significant because we’re going to build up our own number representation out of parentheses – admittedly a strange exercise – but going to prove that we can do maths in a lisp implemented in itself.

We’ll start with zero?:

(def zero_?
  (fn [n]
    (null? n)))

(println (zero_? '()))
;//=>true
(println (zero_? 0))
;=>false

Ok – maybe that didn’t make sense – let’s give a few examples. So now zero is (), one is (()), and two is (()()).
Let’s do some operations on our new numeric representation – we’ll start with a new definition of add1.

(def add1
  (fn [n]
    (cons '() n)))

(println (add1 '()))
;//=>(()) ie 1
(println (add1 '(())))
;//=> (() ()) ie 2

Now let’s do sub1:

(def sub1
  (fn [n]
    (rest n)))

(println (sub1 '(()())))
;//=>(()) ie 1
(println (sub1 '(())))
;//=>() ie 0
(println (sub1 '()))
;//=> () unfortunately with our implementation sub1 of zero returns zero

Now we’ll reimplement our plus function:

(def +_
  (fn [n m]
    (cond
      (zero_? m) n
      true (add1 (+_ n (sub1 m))))))

(println (+_ '() '()))
;//=> ()
;//  zero plus zero is zero
(println (+_ '(()) '(())))
;//=> (()()) ie 1 plus 1 is 2

Now we’ll test for valid numbers in our new scheme by rewriting our function number?:

(def number_?
  (fn [n]
    (cond
      (null? n) true
      true (and
             (null? (first n))
             (number_? (rest n))))))

(println (number_? '() ))
;//=>true
(println (number_? '(()) ))
;//=>true
(println (number_? '((())) ))
;//=>false
(println (number_? '(()()) ))
;//=>true

You can see it running here.

Conclusion:
Why is this chapter called shadows? I haven’t figured that out. Perhaps it is a shadow of things to come.

Here we’ve started on the journey of the metacircular interpreter by writing an evaluator for arithmetic expressions. The only primitive we added was expt.

In total our primitives so far are: atom?, null?, first, rest, cond, fn, def, empty?,=, cons, add1,, sub1, one? and expt. These are all the functions (and those in the chapters to come) that we’ll need to implement to get our metacircular interpreter working.