Rails autoloading
There is much to be said about Rails autoloading, and particularly about
ActiveSupport::Dependencies (AS::D for short). What does it do? As the title
says, it loads constants automatically and reloads your code, picking up changes
on every request. Why do we need it? Because it’s convenient! We don’t have to
write require in every single file and keep track of which constants we need and
when: Rails loads constants and tracks what we need automatically. There’s also
no need to restart your server every time you make a change. There are some
pitfalls that you must know about, but once you learn them everything will be OK.
How it works
The previously discussed method self.const_missing(const_name) is the entry
point for AS::D. Consider this example:
# /autoloadable/a.rb
# module A
# end
require 'active_support/dependencies'
ActiveSupport::Dependencies.autoload_paths = ['/autoloadable']
A
AS::D loads module A automatically without any require.
# Meanwhile you can add sleep here and make changes in a.rb
ActiveSupport::Dependencies.clear # Removes A from memory
A # Will load A again
I’ve just introduced clear, the method Rails uses to reload your code. Let’s
dig deeper and see how AS::D actually works. There are two constant loading
strategies: :require and :load (the default). With :require, constants are not
reloaded, because a file that has already been required is not evaluated again.
With :load, all autoloaded constants are removed from memory and loaded again on
demand. What does removing a constant from memory mean? There’s a method called
Module#remove_const(sym):
module A; end
Object.send(:remove_const, 'A')
A # => uninitialized constant A (NameError)
But what if we remove the constant while an instance of the class still exists?
class A
  def self.value
    'value'
  end
end
a = A.new
Object.send(:remove_const, 'A')
a.class # => A
a.class.value # => 'value'
A # => uninitialized constant A (NameError)
That’s interesting: the constant name was removed, but the instance still
reports its class and we can even call class methods on it. It turns out that
remove_const only removes the constant name from Object. A class is an object,
and a constant is just a variable referencing it, so the class object stays in
memory as long as something still refers to it. OK, let’s move on.
Calling require 'active_support/dependencies' injects a few modules into basic
Ruby classes via AS::D.hook!:
def hook!
  Object.class_eval { include Loadable }
  Module.class_eval { include ModuleConstMissing }
  Exception.class_eval { include Blamable }
end
The methods injected into Object override methods like load and require in
order to monitor new constants, although what they do depends on the loading
strategy. Another method you may already know, require_dependency, is also
injected into Object. The methods in Module define the entry point,
const_missing. I’m going to show you how AS::D works with this example:
# /autoloadable/a.rb
# module A
# end
require 'active_support/dependencies'
ActiveSupport::Dependencies.autoload_paths = ['/autoloadable']
A
- Referencing the constant A triggers Module#const_missing, which was overridden by AS::D, and Dependencies.load_missing_constant(from_mod, const_name) is invoked. The first argument is Object, because A == Object::A; the second is :A.
- load_missing_constant tries to find the path for this constant, which by convention lies somewhere in one of the autoload_paths, by means of search_for_file('a'). It just returns the first file it can find with File.file?(File.join(autoload_path, 'a.rb')).
- Then the require_or_load method is invoked, whose behavior depends on the loading strategy. For the require strategy it just requires the given file; for the load strategy, load_file('/autoloadable/a.rb', 'A') is invoked.
- Then it looks for new constants in the given namespaces with new_constants_in(*parent_paths) { Kernel.load(path) }, where parent_paths is [Object], because we are trying to resolve Object::A, and path is the path to the file. How does it monitor them? It’s just the difference between the arrays of constants before and after the load; Object.local_constants gives us all the constants inside Object.
- When it’s done, all newly defined constants are appended to autoloaded_constants. That’s all: the new constant was defined with AS::D.
AS::D.clear removes the autoloaded constants from memory and empties the list of
autoloaded constants inside AS::D. Pay attention to require_dependency: this
method adds all newly defined constants to autoloaded_constants so that they
will be reloaded, unlike constants loaded with the plain require method.
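The bookkeeping described above (diff the constant list around a load, remember what appeared, remove it later) can be sketched in plain Ruby. This is only an illustration: ConstantTracker, watch and TrackedExample are hypothetical names, it looks at top-level constants only, and it is not how AS::D is actually implemented.

```ruby
# Hypothetical, simplified stand-in for the bookkeeping AS::D does.
class ConstantTracker
  attr_reader :autoloaded

  def initialize
    @autoloaded = []
  end

  # Runs the block and records every top-level constant it defines,
  # using nothing more than a before/after difference.
  def watch
    before = Object.constants
    yield
    added = Object.constants - before
    @autoloaded.concat(added)
    added
  end

  # Removes every recorded constant, so the next reference is a miss again.
  def clear
    @autoloaded.each do |name|
      Object.send(:remove_const, name) if Object.const_defined?(name, false)
    end
    @autoloaded.clear
  end
end

tracker = ConstantTracker.new
# Stands in for Kernel.load(path): the block defines one new constant.
added = tracker.watch { Object.const_set(:TrackedExample, Module.new) }
defined_before_clear = Object.const_defined?(:TrackedExample)
tracker.clear
defined_after_clear = Object.const_defined?(:TrackedExample)
```

Anything defined outside of watch is never recorded, which mirrors why a constant loaded with a plain require survives a clear.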
Another convention AS::D uses is a path convention. This example
'A::B'.underscore # => 'a/b'
simply illustrates that :: is treated as a /, which gives us the ability to use
folders as module namespaces:
# /autoloadable/a.rb
# module A
# end
# /autoloadable/a/b.rb
# module A
# module B
# C = 'c'
# end
# end
A::B::C # => 'c'
or even so:
# /autoloadable/a/b.rb
# module A::B
# C = 'c'
# end
A::B::C # => 'c'
Although we haven’t defined module A, it has been created for us automatically
because the directory a exists. It’s really convenient, because you are not
forced to create an empty module just for namespacing purposes.
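To make the path convention concrete without loading ActiveSupport, here is a rough sketch of the name-to-path mapping. constant_path and candidate_files are hypothetical helpers, and the regular expressions are a reduced approximation of String#underscore, not the real implementation.

```ruby
# Hypothetical helper: a reduced approximation of ActiveSupport's underscore.
def constant_path(const_name)
  const_name
    .gsub('::', '/')                          # namespaces become directories
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')    # split acronym from next word
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')        # split camelCase words
    .downcase
end

# Candidate files for a constant, one per autoload path, in lookup order.
def candidate_files(const_name, autoload_paths)
  autoload_paths.map { |root| File.join(root, "#{constant_path(const_name)}.rb") }
end

simple     = constant_path('A::B')                        # => "a/b"
controller = constant_path('Admin::UsersController')      # => "admin/users_controller"
candidates = candidate_files('A::B', ['/autoloadable'])   # => ["/autoloadable/a/b.rb"]
```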
Misconception
Since Ruby passes only one argument to const_missing(const_name), we have no
idea about the nesting. This example works as expected:
# /autoloadable/b.rb
module B
end
# /autoloadable/a.rb
module A
B # => B
end
but this shows wrong results:
# /autoloadable/a.rb
module A
end
# /autoloadable/b.rb
module B
end
p A::B # => B
Is that right? I don’t think so. Had we used pure Ruby, it would have raised
NameError: uninitialized constant A::B, because we explicitly said we need B
inside of A, not the top-level B. But this is Ruby’s fault: it passes so little
information to const_missing that AS::D can do nothing about it.
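The lack of information is easy to observe in plain Ruby. In this small sketch (Outer, MISSES and Widget are made-up names), a bare reference from inside the module and a qualified reference from outside both reach const_missing with exactly the same receiver and the same name:

```ruby
module Outer
  MISSES = []

  # Records every miss instead of loading anything.
  def self.const_missing(name)
    MISSES << [self, name]
    :placeholder
  end
end

module Outer
  nested = Widget          # bare reference, lexically inside Outer
end

qualified = Outer::Widget  # qualified reference from the top level

misses = Outer::MISSES
# Both entries are [Outer, :Widget]; nothing tells the two call sites apart.
```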
Another cool case:
# /autoloadable/a.rb
# module A; end
# /autoloadable/namespace/a/b.rb
# module Namespace::A::B
# A
# end
Namespace::A::B # => What is A inside this namespace?
What would you expect from this example? I think in Ruby it’s obviously the
top-level A, but since AS::D doesn’t know about nesting, would you expect it
to be either Namespace::A or A? Neither: it’s a NameError, which is much more
confusing. This is the last attempt of AS::D to make an assumption: since one
of our parents already has this constant,
Namespace.const_defined?(:A, false) # => true
we must be looking for it in the short form from_mod::const_name. Otherwise Ruby
would have returned it without calling const_missing, so there is no point in
searching upwards from from_mod. Why don’t we search right at the top level?
Maybe because of this:
# c.rb
# C = 'c'
# a/c.rb
# module A
# C = 'ac'
# end
# a/b.rb
# module A::B
# C
# end
A::B::C # => 'ac'
A::B::C # => uninitialized constant A::B::C
Calling it twice gives us an error, for the same reason as in the case above.
The first time, AS::D resolves C as usual, but the second time it starts
checking the enclosing modules, and since A already contains C, it concludes
that Ruby would have resolved it by itself, so this must be the short form, and
an error is raised. Imagine we fixed this and loaded the top-level constant
instead of raising an error. Is that any better? Now the same expression
evaluated twice in a row gives two different constants, which is still sad.
That’s why I’d prefer an error.
Thread safety
Let’s write our own simplified autoloading:
# autoloadable/a.rb
# module A
#   sleep 5
#   def self.hello
#     'hello'
#   end
# end
class Module
  def const_missing(name)
    require "./autoloadable/#{name.downcase}.rb"
    Object.const_get(name)
  end
end
t1 = Thread.new { A.hello }
t2 = Thread.new { A.hello }
t1.join; t2.join
The result is undefined method 'hello' for A:Module (NoMethodError). Why?
Because the second thread takes over while the first one is sleeping. At that
moment module A is already defined, but the method hello isn’t defined yet. It
turns out that AS::D can never be thread safe as long as const_missing isn’t
thread safe. That’s the reason why your production environment loads all the
constants during initialization. In fact, you can see different errors even
with MRI, because it switches threads while waiting for input/output or the
network. For instance, a circular dependency error arises when the first thread
waits for input/output and another thread starts resolving the same constant,
which is already in the list of files being loaded.
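Loading everything up front avoids the race, because no thread can ever observe a half-defined module. The sketch below only illustrates the idea and is not what Rails does internally; the temporary directory and the Greeter module are invented for the example.

```ruby
require 'tmpdir'

# Build a throwaway "autoloadable" directory with one slow-loading file.
dir = Dir.mktmpdir
File.write(File.join(dir, 'greeter.rb'), <<~RUBY)
  module Greeter
    sleep 0.1 # simulates slow work while the module is half-defined
    def self.hello
      'hello'
    end
  end
RUBY

# Eager load: require every file on a single thread before any worker starts.
Dir.glob(File.join(dir, '**', '*.rb')).sort.each { |file| require file }

# Threads now only ever see the fully defined module.
results = 2.times.map { Thread.new { Greeter.hello } }.map(&:value)
```

Note that wrapping const_missing in a lock would not be enough in the example above: the second thread never reaches const_missing, because the constant A already exists by the time it runs.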
Known errors
- Toplevel constant B referenced by A::B
Have you ever seen the warning ‘toplevel constant B referenced by A::B’? It’s
easy to reproduce even without AS::D:
class B; end
class A; end
A::B
Since A.ancestors is [A, Object, Kernel, BasicObject], which contains the class
Object, and B is already defined as a top-level constant, Ruby shows us a
warning that the constant we’re trying to resolve inside A references a
top-level constant. (Ruby 2.5 and later raise NameError here instead.) Notice
that for modules the situation is different:
module B; end
module A; end
A::B
It gives us just ‘uninitialized constant A::B (NameError)’, because the
ancestors chain of a module doesn’t contain Object.
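The difference between classes and modules can be checked directly. This is a small sketch with made-up names (TopLevelThing, SomeClass, SomeModule); it looks only at the ancestors chain and at the inherit flag of const_defined?, the same flag used in Namespace.const_defined?(:A, false) earlier.

```ruby
class TopLevelThing; end
class SomeClass; end
module SomeModule; end

# A class inherits from Object, a module does not.
class_has_object  = SomeClass.ancestors.include?(Object)
module_has_object = SomeModule.ancestors.include?(Object)

# With inheritance, SomeClass "sees" top-level constants through Object.
seen_through_ancestors = SomeClass.const_defined?(:TopLevelThing)
# With inherit = false, only constants defined directly in SomeClass count.
defined_directly = SomeClass.const_defined?(:TopLevelThing, false)
```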
- Circular dependency detected while autoloading constant
This is yet another AS::D error that you may see:
# /autoloadable/a.rb
# B
# module A
# end
# /autoloadable/b.rb
# A
# module B
# end
A
If we try to access the constant A we’ll see this error. AS::D assumes that
this constant is defined in the file a.rb. While loading this file it finds
another undefined constant, B, and while loading b.rb it again faces the still
undefined A. This would cause infinite recursion, so an error is raised to
prevent it. This error, like many others, can also appear in a multi-threaded
environment.
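One way to detect such a loop is to remember which files are currently being loaded and refuse to enter the same file twice. The following is a sketch under that assumption, not the AS::D source: MiniLoader, fetch, the $mini_loader global and the Ping/Pong files are all hypothetical, and the files call the loader explicitly instead of relying on const_missing.

```ruby
require 'tmpdir'

# Hypothetical mini-loader that guards against re-entering a file.
class MiniLoader
  class CircularDependency < StandardError; end

  def initialize(root)
    @root = root
    @loading = [] # files currently being loaded, innermost last
  end

  def fetch(name)
    return Object.const_get(name) if Object.const_defined?(name)

    path = File.join(@root, "#{name.to_s.downcase}.rb")
    if @loading.include?(path)
      raise CircularDependency,
            "Circular dependency detected while autoloading constant #{name}"
    end

    @loading << path
    begin
      load path
    ensure
      @loading.pop
    end
    Object.const_get(name)
  end
end

# Two files that need each other before defining their own constant.
dir = Dir.mktmpdir
File.write(File.join(dir, 'ping.rb'), "$mini_loader.fetch(:Pong)\nmodule Ping; end\n")
File.write(File.join(dir, 'pong.rb'), "$mini_loader.fetch(:Ping)\nmodule Pong; end\n")

$mini_loader = MiniLoader.new(dir)
message = begin
  $mini_loader.fetch(:Ping)
  nil
rescue MiniLoader::CircularDependency => e
  e.message
end
```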
- A copy of A has been removed from the module tree but is still active!
# /autoloadable/money.rb
# class Money
# end
# /autoloadable/customer.rb
# class Customer
#   def money
#     Money.new
#   end
# end
customer = Customer.new
ActiveSupport::Dependencies.clear
customer.money
This happens because the Customer class was autoloaded, but the Money class
wasn’t, since we hadn’t invoked Money.new before AS::D.clear. Customer was then
removed as a constant, but an instance of the old class is still in memory.
When we call customer.money, we try to resolve the constant Money from
customer.class, which is a different object from a newly loaded Customer.
Please note that keeping a whole instance somewhere between requests increases
the chances that you’ll see this error.
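Whether an object belongs to such an orphaned class can be checked by comparing its class with whatever the class name currently resolves to. This is a plain-Ruby sketch of that idea; stale_class? and Invoice are hypothetical names, and the manual remove_const plus redefinition only imitates what a reload does.

```ruby
# Hypothetical helper: true when klass is no longer what its name resolves to.
def stale_class?(klass)
  !(Object.const_defined?(klass.name) &&
    Object.const_get(klass.name).equal?(klass))
end

class Invoice; end
invoice = Invoice.new
stale_before_reload = stale_class?(invoice.class)

Object.send(:remove_const, :Invoice) # the "clear" half of a reload
class Invoice; end                   # the file is evaluated again

stale_after_reload = stale_class?(invoice.class)
new_class_is_stale = stale_class?(Invoice)
old_name = invoice.class.name # the old class still answers to the same name
```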
Conclusion
You don’t have to care about any of this if you have a flat hierarchy, that is, when you don’t use namespaces and all the files have different names. That is hard to keep up once you have a lot of classes and modules, so you need a clear understanding of the following if you want to stay out of trouble:
- Don’t use the same name for top-level and namespaced constants.
- Be careful with the short form of declaration (module A::B) unless you know what you are doing.
- If you use the short form and it contains the name of the constant you are trying to resolve, use the full path instead.
- Work carefully with constants in initializers: you may declare a new constant instead of loading the original.
- Try running your application with eager loading or in production mode, since it may load your code in a different order.