Modules and the Toplevel
Using modules and the toplevel together effectively requires mastering a few practical details.
Loading Compiled Modules
Compiling an OCaml file produces a module having the same name as the file, but
with the first letter capitalized. These compiled modules can be loaded into
the toplevel using #load.
For example, suppose you create a file called mods.ml, and put the following
code in it:
let b = "bigred"
let inc x = x+1
module M = struct
  let y = 42
end
Then suppose you type ocamlbuild mods.byte to compile it. Inside the _build
directory you will now find the files that ocamlbuild produced. One of them is
mods.cmo: this is a compiled module object file, aka bytecode.
You can make this bytecode available for use in the toplevel with the following
directives (recall that the # character is required in front of a directive; it
is not part of the prompt):
# #directory "_build";;
# #load "mods.cmo";;
The first directive tells utop to add the _build directory to the path in which
it looks for compiled (and source) files. The second directive loads the
bytecode found in mods.cmo, thus making a module named Mods available to be
used. Both of these expressions will therefore evaluate successfully:
# Mods.b;;
- : string = "bigred"
# Mods.M.y;;
- : int = 42
But this will fail:
# inc;;
Error: Unbound value inc
It fails because inc is in the namespace of Mods. Of course, if you open the
module, you can directly name inc:
# open Mods;;
# inc;;
- : int -> int = <fun>
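One way to picture what #load makes available is to write the same module by
hand. The sketch below is only an approximation, assuming the contents of
mods.ml shown earlier: it wraps those definitions in an explicit module named
Mods, which is roughly what compiling the file produces. The names s, n, and
two are introduced here purely for illustration.

```ocaml
(* Roughly what the compiler builds from mods.ml: the file's
   definitions wrapped in a module named after the file. *)
module Mods = struct
  let b = "bigred"
  let inc x = x + 1
  module M = struct
    let y = 42
  end
end

(* Qualified names work without opening the module. *)
let s = Mods.b
let n = Mods.M.y

(* After opening Mods, inc can be named directly. *)
open Mods
let two = inc 1
```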
Initializing the Toplevel
If you are doing a lot of testing of a particular module, it can be
annoying to have to type those directives (#directory and #load)
every time you start utop. You really want to initialize the toplevel
with some code as it launches, so that you don't have to keep typing that
code.
The solution is to create a file in the working directory and call that file
.ocamlinit. Note that the . at the front of that filename is required and makes
it a hidden file that won't appear in directory listings unless explicitly
requested (e.g., with ls -a). Everything in .ocamlinit will be processed by
utop when it loads.
For example, suppose you create a file named .ocamlinit in the same
directory as mods.ml, and in that file put the following code:
#directory "_build";;
#load "mods.cmo";;
open Mods
Now restart utop. All the names defined in Mods will already be in scope.
For example, these will both succeed:
# inc;;
- : int -> int = <fun>
# M.y;;
- : int = 42
Requiring Libraries
Suppose you were to add the following lines to the end of mods.ml:
open OUnit2
let test = "testb" >:: (fun _ -> assert_equal "bigred" b)
If you try to recompile the module with ocamlbuild mods.byte, it will
fail. The problem is that you need to tell the build system to include
the third-party library OUnit. Recompiling with
ocamlbuild -pkg oUnit mods.byte will succeed.
But if you restart utop, there will be an error message:
File ".ocamlinit", line 1:
Error: Reference to undefined global `OUnit2'
The problem is that the OUnit library hasn't been loaded into utop yet. It can be loaded with the following directive:
#require "oUnit";;
Now you can successfully load your own module without getting an error.
#load "mods.cmo";;
Moreover, if you add that #require directive to .ocamlinit anywhere before the
#load directive, the "undefined global" error will go away.
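Assembled from the directives shown so far, a .ocamlinit for this example might
look like the following sketch. It assumes utop is started in the directory
containing mods.ml and that the project has already been built into _build; the
only ordering that matters is that #require comes before #load.

```ocaml
(* .ocamlinit: a sketch combining the directives from this section. *)
#directory "_build";;   (* where utop should look for mods.cmo *)
#require "oUnit";;      (* must come before loading code that uses OUnit2 *)
#load "mods.cmo";;      (* makes the Mods module available *)
open Mods
```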
Dependencies
When compiling a file, the build system automatically figures out which other files it depends on, and recompiles those as necessary. The toplevel, however, is not as sophisticated: you have to make sure to load all the dependencies of a file.
Suppose you have a file named mods2.ml in the same directory as mods.ml
from above, and mods2.ml contains this code:
open Mods
let x = inc 0
If you run ocamlbuild -pkg oUnit mods2.byte, the compilation will
succeed. You don't have to name mods.byte on the command line, even
though mods2.ml depends on the module Mods. The build system is smart
that way.
Also suppose that .ocamlinit contains exactly the following:
#directory "_build";;
#require "oUnit";;
If you restart utop and try to load mods2.cmo, you will get an error:
# #load "mods2.cmo";;
Error: Reference to undefined global `Mods'
The problem is that the toplevel does not automatically load the modules that
Mods2 depends upon. There are two ways to solve this problem.
First, you can manually load the dependencies, like this:
# #load "mods.cmo";;
# #load "mods2.cmo";;
Second, you could instead tell the toplevel to load Mods2 and, recursively,
everything it depends on:
# #load_rec "mods2.cmo";;
And that is probably the better solution.
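The ordering constraint can be seen in miniature inside a single source file.
The following sketch is an analogy rather than what the toplevel actually
executes: it writes Mods and Mods2 as explicit modules (trimmed to the
definitions needed here), and Mods2 compiles only because Mods is defined
before it, just as mods.cmo must be loaded before mods2.cmo.

```ocaml
(* Stand-in for mods.cmo: it has to come first. *)
module Mods = struct
  let inc x = x + 1
end

(* Stand-in for mods2.cmo: it refers to Mods, so placing this module
   above Mods would be rejected with an unbound-module error. *)
module Mods2 = struct
  open Mods
  let x = inc 0
end
```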
Load vs Use
There is a big difference between #load-ing a compiled module file and #use-ing
an uncompiled source file. The former loads bytecode and makes it available for
use. For example, loading mods.cmo caused the Mods module to be available,
and we could access its members with expressions like Mods.b.
The latter (#use) is textual inclusion: it's like typing the contents of
the file directly into the toplevel. So using mods.ml does not cause
a Mods module to be available, and the definitions in the file
can be accessed directly, e.g., b.
For example, in the following interaction, we can directly refer to b but
cannot use the qualified name Mods.b:
# #use "mods.ml";;
# b;;
- : string = "bigred"
# Mods.b;;
Error: Unbound module Mods
Whereas in this interaction the situation is reversed:
# #directory "_build";;
# #load "mods.cmo";;
# Mods.b;;
- : string = "bigred"
# b;;
Error: Unbound value b
So when you're using the toplevel to experiment with your code, it's often
better to work with #load than with #use, because #load accurately reflects how
your modules interact with each other and with the outside world.
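One consequence of textual inclusion is worth seeing concretely: definitions
brought in with #use land at the top level, where they can shadow names already
in the session, whereas loaded definitions stay inside their module. The sketch
below imitates both situations in ordinary OCaml code. The initial binding of b
and the names before_use, qualified, and after_use are hypothetical, chosen
only for this illustration.

```ocaml
(* A binding that already exists in the session (hypothetical). *)
let b = "my own b"

(* Roughly what #load "mods.cmo" provides: names stay inside Mods. *)
module Mods = struct
  let b = "bigred"
end

let before_use = b       (* the session's own b is untouched *)
let qualified = Mods.b   (* the loaded definition, reached by qualified name *)

(* Roughly what #use "mods.ml" does: the definition is entered at the
   top level, so this b shadows the earlier one. *)
let b = "bigred"
let after_use = b
```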