Hierarchical
A hierarchical build is one that has a hierarchy of
modules. That is, it is possible for a module to be composed of
smaller, more specific child modules called submodules.
If a module has children, it is responsible for ensuring those
child modules are built in the proper manner. Later, we'll
discuss how the example build environment applies the hierarchical
concept.
Artifact-Driven
An artifact-driven build is one where each module or
submodule exists for the purpose of generating a single deployable
artifact. In Java projects, these artifacts are almost always .jar,
.war, or .ear files. In other types of builds, they are usually
binary executables or dynamically linked libraries (.dll or
.so).
The example build environment is also artifact-driven, and we'll
discuss how it creates deployable artifacts.
Although these three concepts are pretty easy to understand,
they become very powerful when incorporated into a build
environment.
Now let's take a first look into how the environment is
organized.
Modular Organization
When there is a lot to accomplish, it makes sense to break down
the problem into smaller parts. We need a good divide-and-conquer
technique to help manage the large amounts of source code. It makes
sense to do this in a build environment by creating build
modules.
We create a module by creating a directory under the application
root. This new directory becomes the module's base. Under each
module directory, we find all the files and source code related to
that module.
Here is a sample application's build environment, organized in
modules:
appname/
|-- admin/
|-- core/
|-- db/
|-- lib/
|-- ordermgt/
|-- reports/
|-- web/
|-- build.xml
And here's what each entry means:
- Each directory except for lib/ is a module. In
this sample environment, we have an
admin module that
provides implementations of business POJOs that allow someone to
administer the application (e.g., create users, assign permissions,
etc.). Likewise, there is a reports module, which is
where we can find the implementations of components that enable
report generation. The core module is sort of a
catch-all module for components that are used across any/all
modules and can't really be associated with just one system
function (e.g. StringUtil classes, etc.). Typically,
all other modules will depend upon the core module.
The other modules are just like the admin,
reports, and core modules: they each deal
with a respective system function that is mostly self-contained and different from any other module. Because our sample
app supports web-based interaction, we also have a web module,
which includes everything needed to build a .war file.
- The lib/ directory is a little special. It
contains all third-party .jars required to either build or run the
application. We keep all third-party .jars used by our modules in
this directory, instead of in the modules themselves, for three
reasons:
- It's easier to manage third-party dependencies from a single
location. Whether or not a module uses one of these libraries is
defined in a module-specific Ant
<path> entry in the module's
build.xml file.
- It avoids classloading or API version conflicts by eliminating
the possibility of duplicate .jars. If more than one module uses
Jakarta Commons Logging, which module is responsible for storing
the commons-logging.jar file? If each stored their own
copy, there is the potential that one module might have one version
and another module might have a different version. When the
application is being run, only the first .jar file found in the
classpath is used to satisfy the dependency, potentially causing a
conflict for the other module. We avoid that by managing only one
.jar at the root level.
- Third-party dependencies are versioned with your source
code. Often overlooked, this is the most
important reason to store your dependency libraries in a
revision control system (RCS). By doing this, you ensure that no matter what version or
branch of your software you check out, you'll always have the
proper versions of third-party libraries needed to run that
particular version of your software.
- The root build.xml file is primarily just a
management file. It is responsible for knowing what build files and
targets are necessary to build each module. The module
then does what it needs to ensure its artifact is built
properly.
For example, if the project were being built and it was time to
build the ordermgt module, the root build file would
"know" to call an Ant target in the ordermgt/build.xml
file. The ordermgt/build.xml file would then know
exactly what is required to create the ordermgt .jar file. Also, if
this project were to be consolidated entirely into a single .ear
file, the root build.xml file would be responsible for building that
.ear.
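As a rough sketch of the module-specific <path> entry mentioned above, a module such as ordermgt might declare its third-party dependencies along these lines. The property names, the path id, and the compile target are assumptions made for illustration; they are not taken from the example environment.

```xml
<!-- Hypothetical excerpt from ordermgt/build.xml.
     Property names and the path id are illustrative assumptions. -->
<property name="root.dir" location=".."/>
<property name="lib.dir" location="${root.dir}/lib"/>

<!-- Only the third-party .jars listed here are visible
     when this module compiles. -->
<path id="ordermgt.classpath">
  <fileset dir="${lib.dir}">
    <include name="commons-logging.jar"/>
  </fileset>
</path>

<target name="compile">
  <mkdir dir="build/classes"/>
  <javac srcdir="src" destdir="build/classes"
         classpathref="ordermgt.classpath"
         includeantruntime="false"/>
</target>
```

Because the .jar lives only in the root lib/ directory, upgrading it is a single change that every module referencing it picks up.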
How does the
root build.xml file know to build the modules and the
order in which they are to be built for any given target? Here's a snippet
of Ant XML that shows how:
<!-- =========================================
Template target. Never called explicitly,
only used to pass calls to underlying
children modules.
========================================= -->
<target name="template" depends="init">
<!-- Define the modules and the order in which
they are executed for any given target.
This means _order matters_. Any
dependencies that are to be satisfied by
one module for another must be declared
in the order the dependencies occur. -->
<echo>Executing "${target}" target for the core module...</echo>
<ant target="${target}" dir="core"/>
<echo>Executing "${target}" target for the admin module...</echo>
<ant target="${target}" dir="admin"/>
...
</target>
This template target passes whatever build
target is called on the root build.xml file on to the
child modules in a known order. For example, if we wanted to
clean the entire project, we would only have to call the
clean target at the root of the project, and the
following target is executed:
<!-- =========================================
Clean all modules.
========================================= -->
<target name="clean" depends="init">
<echo>Cleaning all builds</echo>
<antcall target="template">
<param name="target" value="clean"/>
</antcall>
</target>
This root clean target is explicitly called and the
build.xml file in turn implicitly calls the template
target, which ensures that all modules are cleaned.
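Other root-level targets can presumably follow the same pattern. Here is a minimal sketch, assuming each module's build.xml defines a target named build (a hypothetical name chosen here for illustration):

```xml
<!-- Hypothetical root target: build all modules in the
     order declared by the template target. -->
<target name="build" depends="init">
  <echo>Building all modules</echo>
  <antcall target="template">
    <param name="target" value="build"/>
  </antcall>
</target>
```

The <param> element sets the target property only for that one <antcall> invocation, so each root target can forward a different target name through the same template.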
The above modular organization and related build targets really
make managing source code and builds easier. The structure helps
you find code you want to work with faster and more easily. And the
template target organizes how things are executed.
But here's the best part of the modular structure:
After doing a full build on the whole project, any module can be
built independently of the full build. Just change into the module
directory on the command line and run:
> ant target
and that module's build.xml file takes over. You can run any target
at any level in the build, and only that level will be built.
Why is this important? Because it allows you to work
independently in your module space and build just that module. Each
change you make to a module's source file doesn't require you to
build the entire project all over again. This is a huge
time-saver in larger projects.
Now we'll take a look at how an individual module is
structured.
We organize a module's directory structure according to
common Java industry conventions for source code management.
Although there are different conventions, this is the directory
structure used in our build environment:
modulename
|-- build/
|-- etc/
|-- src/
|-- test/
|-- build.xml
Here's what each entry means:
- build: This directory is special in that it is generated
by the module build. All other directories and files listed above
are entered into the RCS. The build directory contains all
files generated during the build process, from auto-generated XML
to compiled Java class files, and finally any distribution
artifacts (.war, .jar, .ear, etc.). This makes it very easy to clean
a build by just deleting this directory.
- etc: This directory stores all config files used by the module at build time or run time. Most
of the time, you'll find properties files and XML config files in
here, such as log4j.properties or struts-config.xml.
If there are a lot of files, they're typically organized into
subdirectories for the components they relate to; e.g.,
etc/spring/, etc/struts/, etc/ejb/, etc.
- src: This directory is the root of your source file
directory tree. It contains only directories
that directly correspond to a package and/or classpath
location. So you'll usually see a com/ or
net/ or org/ directory here starting a
com.whatever or net.something or
org.mydomain package structure. It is important to
note that only things that have a one-to-one classpath correspondence
are saved in this directory (i.e., package directories or .java
source files).
- test: This directory is for your test classes (e.g.,
JUnit test cases). The
important thing here from an organization perspective is that the
package structure mirrors exactly that found under the
src directory. This makes it very convenient for
managing test cases, because you instantly know that the class:
moduleroot/test/com/domain/pkg/BusinessObjectTest
is a test case for the class
moduleroot/src/com/domain/pkg/BusinessObject.
This simple mirroring technique is very helpful in
managing large amounts of code. It's very easy to find your test
cases.
- build.xml: This Ant file knows how to do everything
needed by this module to build and distribute the artifact for which it is
responsible. If this module has any submodules, it also knows
how to build those submodules and in which order they
should be built. Submodules and build ordering are very important
concepts that we'll cover shortly.
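To make the layout concrete, here is a minimal sketch of what a leaf module's build.xml could look like. The target names, property names, and artifact name are assumptions; the example environment's real build files are not shown in this article.

```xml
<!-- Hypothetical build.xml for a module with no submodules. -->
<project name="ordermgt" default="jar" basedir=".">

  <!-- Everything generated by the build goes under build/. -->
  <property name="build.dir" location="build"/>
  <property name="classes.dir" location="${build.dir}/classes"/>

  <target name="init">
    <mkdir dir="${classes.dir}"/>
  </target>

  <!-- Compile the sources found under src/. -->
  <target name="compile" depends="init">
    <javac srcdir="src" destdir="${classes.dir}"
           includeantruntime="false"/>
  </target>

  <!-- Package the module's single artifact (name assumed). -->
  <target name="jar" depends="compile">
    <jar destfile="${build.dir}/appname-ordermgt.jar"
         basedir="${classes.dir}"/>
  </target>

  <!-- Cleaning is just deleting the generated directory. -->
  <target name="clean">
    <delete dir="${build.dir}"/>
  </target>

</project>
```

With a file like this in place, running ant clean or ant jar from inside the module directory affects only that module's build/ directory.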
Submodules
A submodule is just a module that is a child of another (parent)
module. You might have seen other module-based Ant builds where the
hierarchy is flat; i.e., one level deep. Our build structure goes a
little further than that: ours is two levels deep.
Continuing with our build and the concept of submodules, you
would see a build hierarchy like the following, with the module and
submodule directories expanded:
module1/
|-- submodule1.1/
|   |-- etc/
|   |-- src/
|   ...
|   |-- build.xml
|-- submodule1.2/
|   |-- etc/
|   |-- src/
|   ...
|   |-- build.xml
|-- build.xml
module2/
...
OK, so this looks a little complex. Why would we want to do
this?
Well, let's preface the answer with a little background on
enterprise applications and the concept of an
artifact-driven build.
Enterprise applications are almost always client/server-based.
Even if you only deploy a web application, it's usually architected
as a client-server MVC application. That is, the web page itself is
a client view, but the "server"-side components are usually
business POJOs that execute business logic on behalf of the
component rendering the web page. Even if they are deployed in a
single .war, there is a definite architectural separation
between code that is primarily used for rendering a view (client
code) versus code that is used for processing business requests
(server code). At least, there should be!
The notion of client and server code becomes more obvious in a
more traditional client/server application where there is a
standalone client GUI communicating with a server-side business
object via sockets.
It would be very clean and elegant if we only needed to deploy
client code to the client application and server code to the
application server. Both tiers also probably share common code, so
it would be nice to send common .jars to both client and server.
This is the cleanest way to deploy code and manage dependencies
between tiers. Our build environment has the ability to create
artifacts exactly as desired.
Next we will look at how submodules help us achieve an
artifact-driven build.
Hierarchy and Build Artifacts
The deployment scenario just described surfaces a desire for an
artifact-driven build: each module or submodule in the build
environment should be responsible for creating an artifact that
will be deployed to the client or server or both. This is easily
done in our build environment by further breaking down the modules
in our sample application into common,
client, and server submodules. The
parent-child relationship and delegation of build responsibilities
is what makes this build hierarchical as well.
Using our sample application's admin module, let's
see what the hierarchy looks like in an expanded directory
tree:
appname/
|-- admin/
|   |-- common/
|   |   |-- etc/
|   |   |-- src/
|   |   |-- test/
|   |   |-- build.xml
|   |-- client/
|   |   |-- etc/
|   |   |-- src/
|   |   |-- test/
|   |   |-- build.xml
|   |-- server/
|   |   |-- etc/
|   |   |-- src/
|   |   |-- test/
|   |   |-- build.xml
|   |-- build.xml
...
Each submodule's contents are structured as defined before, but there's a noticeable
difference.
The admin module does not have the typical module contents. It
just has submodules and a build.xml, and it doesn't
produce any artifacts itself. Instead it calls build targets in the
common/build.xml, server/build.xml, and
client/build.xml files via the template technique described earlier.
So if you wanted to build the admin module, you just change into
the admin directory and run Ant:
> cd admin/
> ant
This command uses the admin build.xml file, which in
turn builds the common, server, and
client submodules. After each submodule is built,
there will be three resulting artifacts:
appname-admin-common.jar
appname-admin-server.jar
appname-admin-client.jar
The common and server .jars can then be
deployed to the server (e.g., in an .ear file), and the
common and client .jars can be deployed to
the client (e.g., in a .war's WEB-INF/lib
directory).
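The parent module's build file can be quite small. The following is only a sketch of what admin/build.xml might contain, reusing the template technique from the root build file; the build target name is an assumption, and the submodule order (common, server, client) follows the order given above.

```xml
<!-- Hypothetical admin/build.xml: produces no artifact itself,
     it only delegates to its submodules in a fixed order. -->
<project name="admin" default="build" basedir=".">

  <!-- Never called directly. common comes first because
       server and client may depend on it. -->
  <target name="template">
    <ant target="${target}" dir="common"/>
    <ant target="${target}" dir="server"/>
    <ant target="${target}" dir="client"/>
  </target>

  <target name="build">
    <antcall target="template">
      <param name="target" value="build"/>
    </antcall>
  </target>

  <target name="clean">
    <antcall target="template">
      <param name="target" value="clean"/>
    </antcall>
  </target>

</project>
```

Each <ant> call uses the build.xml found in the given dir, so the parent never needs to know how a submodule produces its .jar.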
What is the purpose of each submodule? Well, they help organize
code into cleanly managed subsets of functionality that will be
deployed in different tiers of the application. Here's what the
above three submodules typically contain:
- common: All code that is common to both client
and server tiers for the module. This typically means business POJO
interfaces, utility classes, etc.
- server: Class implementations only needed on the
server tier. These are generally implementations of business POJO
interfaces, DAO implementations for EIS access, etc.
- client: Class implementations only needed on the
client tier, such as Swing GUI objects, EJB remote interfaces,
etc.
This kind of granularity of submodules and their respective
deployment artifacts benefits you in four substantial ways:
- Download times: You can ensure that standalone client
applications such as applets and Java Web Start
applications receive the smallest subset of .jars required to run.
This ensures the fastest possible download times of an application
or applet being run for the first time.
- Dependency management: Via an Ant
<path>
entry in the submodule's build.xml file, you can list
exactly which other module and/or submodules are allowed
as dependencies by the current submodule. This eliminates any lazy
or accidental use of APIs that a developer is not supposed to use
or won't be supported during runtime.
- Dependency ordering: Because the parent module determines
build order for submodules, you can rest assured that the
client code you write can depend upon
common code, but not server code. Also,
common code cannot be written that is dependent upon
server or client code. If you do these
things, your build will break, and you'll instantly be alerted that
you accidentally used classes that you shouldn't have. This may
sound like a small or nit-picky issue, but this problem quickly
rears its head in complex projects or those where the developers
have different levels of experience and may not be aware of
dependency management.
- Just as you can with modules, you can build just a single
submodule by changing into its directory and running
> ant
and Ant will build only that submodule, saving you time.
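The dependency-management point above can be illustrated with a sketch of the <path> entry a client submodule might use. The path id, the relative locations, and the third-party .jar chosen are assumptions; the artifact name comes from the admin example earlier.

```xml
<!-- Hypothetical excerpt from admin/client/build.xml. -->
<path id="client.classpath">
  <!-- The common submodule's artifact, built earlier by the parent. -->
  <pathelement location="../common/build/appname-admin-common.jar"/>
  <!-- Third-party .jars from the shared root lib/ directory;
       the file name is an illustrative assumption. -->
  <fileset dir="../../lib">
    <include name="commons-logging.jar"/>
  </fileset>
  <!-- appname-admin-server.jar is deliberately absent, so client
       code that references server classes fails to compile. -->
</path>
```

Leaving the server artifact off this path is what turns an accidental dependency into an immediate compile failure rather than a runtime surprise.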
Conclusion
Modules and submodules may look complicated. They probably look
like overkill to you at this point. But trust me from experience,
they greatly simplify how you manage source code and
dependencies, and how Ant builds your product. The structure
defined here really does make product-feature and source-code
management easier in a team environment. It takes a lot of the
guesswork out of figuring out how to do all of the organization
yourself, and once set up, is pretty transparent. If you're
starting a new client/server project, give it a shot. You'll spend
more time working on your application, and less time worrying about
configuration management.
Special thanks to Jeremy Haile of Transdyn Controls for his valuable
input and review of this article.
Les A. Hazlewood
is the director of software engineering at Roundbox Media in Atlanta, Georgia.