Fluent Interfaces in CoffeeScript

We’ve all seen them – builder patterns that make object construction clean and readable.

person().named('Bob').withSpouse('Alice').bornOn('01-26-1982').build()

I used to do these all the time in Java (we called them fluent interfaces), and I realized today that I had no idea how to do this style in CoffeeScript. Well, let's remedy that.

To get started, I'm going to follow the basic pattern I've used in Java. Since CoffeeScript provides native class functionality, it's a pretty simple clone.

class Person

  named: (name) ->
    @name = name
    @

  withSpouse: (spouse) ->
    @spouse = spouse
    @

  bornOn: (dob) ->
    @dob = dob
    @

  build: ->
    return {
      name: @name
      spouse: @spouse
      dob: @dob
    }

console.log new Person().named('Adam').withSpouse('Rachel').build()      

But hey, this is CoffeeScript. We can do better. Let's use CoffeeScript's parameter-assignment shortcut (writing @name directly in the argument list) to reduce the code length.

class Person
  
  named: (@name) ->
    @

  withSpouse: (@spouse) ->
    @

  bornOn: (@dob) ->
    @

  build: ->
    return {
      name: @name
      spouse: @spouse
      dob: @dob
    }

console.log new Person().named('Adam').withSpouse('Rachel').build()      

I suspect there may be an even cleaner way to do this, but this seems concise enough for now.
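
One cleaner direction, sketched in plain JavaScript (the chaining mechanics are the same as what the CoffeeScript compiles to; the builder() helper and field names here are hypothetical, not from my repo): generate the chainable setters from a list of field names instead of writing each one by hand.

```javascript
// Sketch: generate chainable setters from a field list.
// Plain JavaScript; builder() and the field names are made up for illustration.
function builder(fields) {
  var state = {};
  var api = {
    build: function () { return state; }
  };
  fields.forEach(function (field) {
    api[field] = function (value) {
      state[field] = value;
      return api; // returning the builder object is what makes the calls chain
    };
  });
  return api;
}

console.log(builder(['name', 'spouse', 'dob']).name('Adam').spouse('Rachel').build());
// { name: 'Adam', spouse: 'Rachel' }
```

Returning the builder object from every setter is the whole trick; build() just hands back the accumulated state.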

The full source code is available here: https://github.com/adamnengland/coffee-fluent-interface


Heroku vs NodeJitsu vs Appfog

For the next few months, I'll be working with the team at LocalRuckus, building a new Node.js API and application.  As a small shop with no dedicated sysadmin or DevOps, it's essential that we find Node.js hosting that is flexible, fast, and cost-effective.  I've been considering three major players in the Node.js hosting scene: Heroku, Nodejitsu, and Appfog.  There are some good comparisons out there (I especially like Daniel Saewitz's article), but I wanted to share my 2 cents.

Value for Development

Heroku provides a great feature for development/sandbox apps – your first dyno is FREE.  Combine this with the starter Postgres package, and you can have a development version of your app up and running for $0/month.

Nodejitsu does not offer a free tier, so you are on the hook for paying for pet projects, etc.  That said, their pricing starts at $9/mo for a micro package, and scales up pretty gently from there.

Appfog provides a pretty great package for trying out an app.  You can provision your database, caching server, queue server, and application servers in a few clicks, all managed from one central dashboard.

Winner:  Appfog

Value for Production

Heroku pricing scales linearly with your traffic.  Using a simple slider, you can add new dynos to your application.  Each new dyno runs $35/mo; however, there is no commitment – you can scale up for brief spikes and scale down if traffic subsides.

Nodejitsu and Appfog, on the other hand, have fixed monthly prices.

Nodejitsu prices based on drones, which seem to offer 256MB RAM and processing power roughly equivalent to half a Heroku dyno.

Appfog prices based on RAM, which creates a bit of a problem.  While 2GB of memory can be had for $20/mo, moving up to 4GB is a rather steep $100/mo (a jump from $10/GB to $25/GB).

Winner: Heroku

Deployment

Heroku – Deploy to a git repository

Appfog – Use the downloadable af tool to push updates

NodeJitsu – Use the jitsu tool or git integration.

Winner: Nodejitsu

Database

Heroku leverages its expertise in Postgres, providing plans that scale with your application, including free database tiers for getting started.  Production databases support 1TB of data, starting at $50/month.  If you prefer another database platform, add-ons are available for Redis, Mongo, CouchDB, ClearDB, JustOneDB, and Amazon RDS.

Nodejitsu continues to take a fairly minimalist approach, with no built-in database.  They provide add-ons for Mongo, Redis, and CouchDB.

Appfog allows you to use your service instances to host Redis, Mongo, Postgres, or MySQL databases.  They also provide add-ons for Mongo and ClearDB.  The main knock here is that your database shares the processing and memory quotas with your other services, and I'm skeptical that such an approach could support a high-traffic app.

Winner: Heroku. Production quality Postgres, with great Add-Ons for a variety of other databases.

Other Languages

Heroku: Ruby, Java, Python, Clojure, Scala, Play, PHP
Nodejitsu: None
Appfog: Ruby, Java, Python, PHP
Tie: Appfog & Heroku. Appfog's PHP support opens a lot of opportunities (such as hosting WordPress); however, Appfog seems to have trouble keeping their runtimes up to date. As of July 2013, their Node.js was only at version 0.8. Heroku provides good language options and a serious commitment to keeping the runtimes up to date.

Other Considerations and Final Thoughts

An important consideration for many Node apps is WebSocket support.  Nodejitsu has it; the others don't.  If you need this feature, your choice is clear.  At this point, Heroku's flexibility, large community, and great add-ons make it my go-to for applications; however, I think Appfog has put together a great offering, and I'm looking forward to using it more in the future.

Introducing Borderizer – Helping Travelers Move Faster

On a recent trip to San Diego, my wife and I crossed the U.S./Mexico border at San Ysidro to visit the lovely city of Tijuana. A 5 minute walk across the border was all it took to enter Mexico.

After a few hours of touring the Mexican shops (and the accompanying drug offers), we were ready to return to the United States. However, returning to the U.S. is not nearly as easy as leaving it. The customs wait at the border checkpoint was over 2 hours long. That's a miserable way to spend 2 hours of my vacation, and a problem that I won't tolerate any longer.

Enter Borderizer…

Borderizer provides you with up-to-the-minute stats on how long the wait is at any border crossing in Mexico or Canada. Equipped with this information, you can plan your trip better. If the wait at San Ysidro is 2 hours, just relax, have a bite to eat at a local taqueria, and plan your customs wait for late afternoon, when the crowds have died down.

Available for iPhone in the App Store


Available for Android on Google Play

Best of all, Borderizer is FREE.  Yep, $0.00.  So go ahead and download today.  We'll be rolling out new features in the upcoming months, including Spanish language support, suggestions on the best time to cross, and maps of the border crossings.  Let us know if you have any suggestions for features you'd like to have in Borderizer – we'd love to make it more useful for everyone.

Heroku Scheduler with Coffeescript Jobs

Heroku provides a free add-on for running scheduled jobs.  This is a convenient way to run scheduled tasks in an on-demand dyno, freeing your web dynos to focus on user requests.   I'm currently writing a Node.js application in CoffeeScript that has some modest job-scheduling needs.

Heroku's documentation is a little thin on this particular use case; however, a good starting point is the One-Off Dyno documentation.  The important concept to remember is that if you can run your command using "heroku run xxx", you'll be able to run it in the scheduler.  These one-off scripts should be placed in a bin/ directory in your project root.

My first attempt is below.  Note that the shebang points at Heroku's node install (Heroku installs node at /app/bin/node).

#! /app/bin/node
require('../server/jobs/revenueCalculator').runOnce()

Deploy to Heroku and run the following command using the Toolbelt.  We get an error immediately.

heroku run runJob
Error: Cannot find module '../server/jobs/revenueCalculator'

Next I wanted to try running interactively, using the coffeescript interpreter:

heroku run coffee
Running `coffee` attached to terminal... up, run.6960
coffee> x = require('./server/jobs/revenueCalculator')
{ start: [Function], runOnce: [Function] }
coffee> x.runOnce()
Job Started
Job Complete

Now we seem to have zeroed in on the problem. Perhaps the script will work if we run it using coffeescript, rather than the node executable. Edit bin/nightlyJob as follows:

#! /app/node_modules/.bin/coffee
job = require '../server/jobs/revenueCalculator'
job.runOnce()

Deploy to Heroku and run.

heroku run nightlyJob
Running `nightlyJob` attached to terminal... up, run.9139
Job Started
Job Complete

Using #! /app/node_modules/.bin/coffee in a standalone script to call the application code seems to do the trick. Now, let's add the Heroku Scheduler to our app and configure the job to run nightly.

heroku addons:add scheduler:standard
heroku addons:open scheduler

A browser should pop open, and we can schedule our nightly job.

That's all, folks.


Clustering Web Sockets with Socket.IO and Express 3

Node.js gets a lot of well-deserved press for its impressive performance. The event loop can handle heavy loads with a single process. However, most servers have multiple processors, and I, for one, would like to take advantage of them. Node's cluster API can help.

While cluster is a core API in Node.js, I'd like to incorporate it with Express 3 and Socket.IO.

Final source code is available on GitHub.

The node cluster docs give us the following example.

cluster = require("cluster")
http = require("http")
numCPUs = require("os").cpus().length
if cluster.isMaster
  i = 0
  while i < numCPUs
    cluster.fork()
    i++
  cluster.on "exit", (worker, code, signal) ->
    console.log "worker " + worker.process.pid + " died"
else
  http.createServer((req, res) ->
    res.writeHead 200
    res.end "hello world\n"
  ).listen 8000

The code compiles and runs, but I have no confirmation that things are actually working. I'd like to add a little logging to confirm that we actually have multiple workers going. Let's add these lines right before the 'exit' listener.

  cluster.on 'fork', (worker) ->
    console.log 'forked worker ' + worker.id

On my machine, we get this output:
coffee server
forked worker 1
forked worker 2
forked worker 3
forked worker 4
forked worker 5
forked worker 6
forked worker 7
forked worker 8

So far, so good. Let's add Express to the mix.

cluster = require("cluster")
http = require("http")
numCPUs = require("os").cpus().length
if cluster.isMaster
  i = 0
  while i < numCPUs
    cluster.fork()
    i++
  cluster.on 'fork', (worker) ->
    console.log 'forked worker ' + worker.process.pid
  cluster.on "listening", (worker, address) ->
    console.log "worker " + worker.process.pid + " is now connected to " + address.address + ":" + address.port
  cluster.on "exit", (worker, code, signal) ->
    console.log "worker " + worker.process.pid + " died"
else
  app = require("express")()
  server = require("http").createServer(app)
  server.listen 8000
  app.get "/", (req, res) ->
    console.log 'request handled by worker with pid ' + process.pid
    res.writeHead 200
    res.end "hello world\n"

At this point, I’d like to throw a few http requests against the setup to ensure that I’m really utilizing all my processors.
Running (curl -XGET "http://localhost:8000") 6 times makes the node process go:

request handled by worker with pid 85229
request handled by worker with pid 85231
request handled by worker with pid 85231
request handled by worker with pid 85231
request handled by worker with pid 85227
request handled by worker with pid 85229

Alright, the last step is getting socket.io involved. Just a couple of extra lines for the socket; however, we'll need to add a basic index.html file to actually make the socket calls.

cluster = require("cluster")
http = require("http")
numCPUs = require("os").cpus().length
if cluster.isMaster
  i = 0
  while i < numCPUs
    cluster.fork()
    i++
  cluster.on 'fork', (worker) ->
    console.log 'forked worker ' + worker.process.pid
  cluster.on "listening", (worker, address) ->
    console.log "worker " + worker.process.pid + " is now connected to " + address.address + ":" + address.port
  cluster.on "exit", (worker, code, signal) ->
    console.log "worker " + worker.process.pid + " died"
else
  app = require("express")()
  server = require("http").createServer(app)
  io = require("socket.io").listen(server)
  server.listen 8000
  app.get "/", (req, res) ->
    res.sendfile(__dirname + '/index.html');
  io.sockets.on "connection", (socket) ->
    console.log 'socket call handled by worker with pid ' + process.pid
    socket.emit "news",
      hello: "world"

 

<script src="/socket.io/socket.io.js"></script>
<script>
  var socket = io.connect('http://localhost');
  socket.on('news', function (data) {
    console.log(data);
    socket.emit('my other event', { my: 'data' });
  });
</script>

When I run this code, problems start to appear. Specifically, the following message shows up in my output

warn – client not handshaken client should reconnect

Not surprisingly, we have issues with sockets appearing disconnected. Socket.io defaults to storing its open sockets in an in-memory store, so sockets in other processes have no access to the information. We can easily fix the problem by using the Redis store for socket.io. The docs we need are here.
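
To make the failure mode concrete, here is a tiny simulation of the problem (an illustration only; these worker objects are made up for the example, not socket.io internals):

```javascript
// Simulation: two "workers", each with a private in-memory handshake store,
// versus a single shared store. Illustration only; not socket.io internals.
function makeWorker(store) {
  return {
    handshake: function (id) { store[id] = true; },
    isHandshaken: function (id) { return !!store[id]; }
  };
}

// In-memory stores: each worker keeps its own map
var a = makeWorker({});
var b = makeWorker({});
a.handshake('socket-1');
console.log(b.isHandshaken('socket-1')); // false – worker B never saw the handshake

// Shared store (the role the Redis store plays): both workers see the same state
var shared = {};
var c = makeWorker(shared);
var d = makeWorker(shared);
c.handshake('socket-2');
console.log(d.isHandshaken('socket-2')); // true
```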

With the redis store in place, it looks like this:

cluster = require("cluster")
http = require("http")
numCPUs = require("os").cpus().length
RedisStore = require("socket.io/lib/stores/redis")
redis = require("socket.io/node_modules/redis")
pub = redis.createClient()
sub = redis.createClient()
client = redis.createClient()
if cluster.isMaster
  i = 0
  while i < numCPUs
    cluster.fork()
    i++
  cluster.on 'fork', (worker) ->
    console.log 'forked worker ' + worker.process.pid
  cluster.on "listening", (worker, address) ->
    console.log "worker " + worker.process.pid + " is now connected to " + address.address + ":" + address.port
  cluster.on "exit", (worker, code, signal) ->
    console.log "worker " + worker.process.pid + " died"
else
  app = require("express")()
  server = require("http").createServer(app)
  io = require("socket.io").listen(server)
  io.set "store", new RedisStore(
    redisPub: pub
    redisSub: sub
    redisClient: client
  )
  server.listen 8000
  app.get "/", (req, res) ->
    res.sendfile(__dirname + '/index.html');
  io.sockets.on "connection", (socket) ->
    console.log 'socket call handled by worker with pid ' + process.pid
    socket.emit "news",
      hello: "world"

Code School’s “Try R”

I feel like 2013 holds a lot of data analysis for me, so I'd like to start the year off by learning a language that excels at statistical analysis and visualization.  Enter R, a language that has gotten quite popular over the past few years.  In the interest of expanding my horizons, I decided to learn it using Code School's Try R course.  Code School courses can be a little simplistic if you have programming experience, but since I haven't ever looked at R, it seems appropriate.

Lesson One:  Using R

The first lesson covers basic variable assignment, functions, and expressions in the REPL environment.  Pretty simple for anyone with a programming background, but it does introduce the somewhat unusual assignment operator in R:

x <- 42

This is going to prove a little confusing for me, as I've recently been using CoffeeScript a lot, with its -> operator for defining functions. I keep swapping the two operators during the lesson.

Lesson Two:  Vectors

Now we are getting somewhere.  In order for me to do any statistical analysis, I'm going to need some data structures.  Vectors are the fundamental one-dimensional list in R.  Code School does an excellent job in this lesson of moving into data visualization early and seamlessly.

> vesselsSunk <- c(4, 5, 1)
> barplot(vesselsSunk)

To be honest, I’ve never used a language before with a barplot function in the core language. At this point in the lesson, I’m pretty excited to keep going. Lesson 2 covers vector math and plotting.

Lesson Three: Matrices

Moving on to two-dimensional data sets.  I can almost feel the correlation coefficients and multiple regressions in my near future.

It turns out that this is kind of an odd chapter.  We look at basic matrix construction and manipulation.

# Construct a matrix
> elevation <- matrix(0, 3, 4)
> elevation
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    0    0    0    0
[3,]    0    0    0    0

# Edit a value
> elevation[2,2] <- 1

I suppose the lesson is successful in showing how to use matrices, but I don’t feel that it imparts much insight into the language. Similarly, we are introduced to contour, persp, and image functions, however, they remain fairly magical at the end of the lesson.

Lesson Four: Summary Statistics

Mean, median, standard deviation.  This one took about 2 minutes to complete, but is obviously very important if you never took statistics.

Lessons Five & Six: Factors & Data Frames

R’s Factors and Data Frames provide nice ways to group and categorize data.  Once you understand factors, you can group a set of users by age, or other distinguishing characteristics.  I really enjoyed these lessons, very practical.

Lesson Seven: Real-World Data

Great finish for the lessons: a bit of analysis on real-world software piracy data.  We finally got an example of data correlation using R, which I'm pretty excited to use on some data sets I'm looking at.

Conclusion

I’d recommend the Try R course to any developer who is interested in data analysis and visualization.    I’d REALLY recommend the course for anyone with an interest in statistics and data analysis who doesn’t know anything about programming.  It really is that easy.  Good work Code School.

Redis Performance – Does key length matter?

I'm currently building a project using Redis as a high-performance cache in a Node.js application (using the excellent node_redis). My keys will be fairly large (between 512 bytes and 1KB). The Redis documentation doesn't specifically warn against keys of this size, but it still seems appropriate to run a benchmark and see how Redis reacts to large keys (and whether or not 1KB is really a large key, or just par for the course).

Test Script (source)

Basically, we insert 1000 records into Redis, each with a 10,000-character value. After the writes are all complete, we read each key back from Redis.

redis = require "redis"

randomString = (length) ->
  chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
  result = ""
  i = length

  while i > 0
    result += chars[Math.round(Math.random() * (chars.length - 1))]
    --i
  result

writeTest = (keyLength) ->
  console.log "1000 set statements for #{keyLength} character keys"
  keys = []
  for x in [1..1000]
    keys.push randomString(keyLength)
  startTime = new Date().getTime()
  for x in keys
    client.set x, randomString(10000)
  client.quit ->
    console.log "1000 keys inserted in #{new Date().getTime() - startTime} ms"
    readTest(keys)

readTest = (keys) ->
  client = redis.createClient()
  startTime = new Date().getTime()
  for x in keys
    client.get x
  client.quit ->
    console.log "1000 keys retrieved in #{new Date().getTime() - startTime} ms"

client = redis.createClient()

client.flushdb ->
  writeTest(20000)

This test was performed for key lengths of 10, 100, 500, 1000, 2500, 5000, 7500, 10,000, and 20,000 characters. Three runs of each were performed to avoid any fluke results. Without further ado, the results.

Write Performance (in ms)

Key Length Run 1 Run 2 Run 3
10 1235 1216 1259
100 1231 1242 1223
500 1283 1240 1270
1000 1277 1317 1345
2500 1318 1279 1294
5000 1376 1391 1386
7500 1223 1204 1265
10000 1220 1252 1235
20000 2065 2014 2016

Read Performance (in ms)

Key Length Run 1 Run 2 Run 3
10 43 41 51
100 45 45 43
500 60 54 58
1000 69 73 79
2500 97 101 102
5000 113 114 110
7500 134 133 136
10000 147 156 151
20000 244 234 241

Not surprisingly, as the key length increases, times do increase.  However, write times are relatively unaffected by key length, while read times are impacted more.   To put it in perspective:

  • Key length 10 – an average write takes 1.24ms, an average read takes 0.045ms
  • Key length 10,000 – an average write takes 1.24ms, an average read takes 0.15ms
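
As a sanity check, those per-operation figures come straight from the tables: average the three runs, then divide by the 1,000 operations per run.

```javascript
// Per-operation averages from the tables above: mean of three runs / 1,000 ops
var avgMs = function (runs) {
  var total = runs.reduce(function (a, b) { return a + b; }, 0);
  return total / runs.length / 1000;
};

console.log(avgMs([1235, 1216, 1259])); // writes, key length 10: ~1.24 ms
console.log(avgMs([43, 41, 51]));       // reads, key length 10: 0.045 ms
console.log(avgMs([1220, 1252, 1235])); // writes, key length 10,000: ~1.24 ms
console.log(avgMs([147, 156, 151]));    // reads, key length 10,000: ~0.15 ms
```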

Whether or not this is significant is really up to you; for my purposes, it seems like an insignificant difference.  At the end of the day, Redis is a fast and flexible tool for caching data.


Map Reducing the Royals with Mongo

MongoDb is a real lifesaver when it comes to improving developer productivity in web applications, however, that’s only a small part of the power in MongoDb. To do a lot of the deep down data mining, we need to learn to use Map/Reduce to massage our data. Please note, some of this functionality can be accomplished using Mongo’s Aggregate functions, however, I’ve intentionally avoided it, as there are limitations with using aggregates on sharded environments, and I expect most of my Mongo apps will need to be sharded.

Since we just finished the 2012 All-Star Game here in Kansas City, a baseball statistics example seems appropriate.

Setting up your environment
You'll need console access to a MongoDB database. To set up Mongo on your computer, see the Quick Start.

Loading some sample data
Let's create some realistic baseball stats. I'll start with the real roster for the Kansas City Royals. However, instead of using their real stats, we'll generate some random numbers using JavaScript's Math object. For example, we know that the best players in the league will get 200 hits, and the worst players get none. Math.floor(Math.random()*200) will give us a random number between 0 and 200. We'll make sure that the number of hits never exceeds the number of at-bats, and we'll keep the number of home runs capped at 50 (rather generous for the Royals).

To add a single player, we can run the following javascript:

db.Player.save({ 
  number : 47, 	
  name: 'Nathan Adcock', 
  hits : Math.floor(Math.random()*200), 
  ab: Math.floor(Math.random()*300)+200, 
  bb:Math.floor(Math.random()*50)+5, 
  hr: Math.floor(Math.random()*50)
});

Grab the script for the whole roster here, and run it in your mongo console.

Counting Home Runs
Confirm that you’ve got the data loaded. Your stats for Billy Butler will vary (my Billy Butler kind of sucks), but you should always have 43 players.

>db.Player.count();
43
>db.Player.find({name: 'Billy Butler'});
{
  "_id" : ObjectId("50021639b5145ef5327c66b2"),
  "number" : 16,
  "name" : "Billy Butler",
  "hits" : 66,
  "ab" : 386,
  "bb" : 5,
  "hr" : 5
}

We now know how many home runs Billy Butler hit this season, but suppose we want to find the total number of home runs that the combined Royals roster hit this season.

var map = function() {
	emit( { }, { hr: this.hr} );
};

var reduce = function(key, values) {
	var sum = 0;
	values.forEach(function(doc) {
		sum += doc.hr;
	});
	// Return the same shape we emit ({ hr: ... }), since Mongo may
	// re-run reduce on its own partial output
	return { hr: sum };
};


homeRuns = db.runCommand( {
     mapreduce: 'Player',
     map: map,
     reduce: reduce,
     out: 'totalHomers',
     verbose: true
} );

db[homeRuns.result].find();
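
For intuition, the same aggregation can be written with ordinary JavaScript array operations (illustration only; this is not the Mongo API, and the hr values are made-up sample data):

```javascript
// The map/reduce above, as plain JavaScript. Illustration only; not the Mongo API.
// The hr values are made-up sample data, not real Royals stats.
var players = [{ hr: 5 }, { hr: 12 }, { hr: 30 }];

var totalHomers = players
  .map(function (p) { return p.hr; })                  // map: emit one value per document
  .reduce(function (sum, hr) { return sum + hr; }, 0); // reduce: fold the emitted values together

console.log(totalHomers); // 47
```

Mongo's version works the same way, except the map and reduce steps can run in parallel across shards.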

A more complex example

Cool, huh? Let's take a slightly more complicated case. We'd like to take all players with more than 250 ABs and group them by batting average.

var map = function() {
	var ba = this.hits / this.ab;
	var key;
	if (ba < .250) {
		key = '< .250';
	} else if (ba < .300) {
		key = '.250 -> .300';
	} else {
		key = '> .300';
	}
	emit(key, { count: 1 });
};

var reduce = function(key, values) {
	var sum = 0;
	values.forEach(function(doc) {
		sum += doc.count;
	});
	// Return the emitted shape so reduce can safely be re-run on its own output
	return { count: sum };
};

ba = db.runCommand( {
    mapreduce: 'Player',
    map: map,
    reduce: reduce,
    query: {"ab": {$gt: 250}},
    out: 'battingAverages',
    verbose: true
} );

db[ba.result].find();

Source Code

These examples are pretty simple, but we can still take away a few lessons:

  • Do the heavy lifting in the map function. These are the tasks that get executed in parallel across your shards. For example, by pushing the batting average calculation, and the categorization into the map function, we ensure a fast runtime across a large dataset.
  • Make use of the query arg for the map/reduce command. By filtering out the undesirable data, we save mapping operations and reduce the load on the db.

Credits:
Thanks to several bloggers who helped me understand this concept.

Full source code is available on GitHub.


5 ways to Play! poorly

As readers may be aware, I'm really into the Play! framework. It combines the convention-over-configuration mindset of RoR with the Java/Scala libraries and skills that I've worked on for years.

That said, Play! isn’t idiot-proof. Here is a list of 5 mistakes to avoid in your Play! projects.

1) Using Modules too liberally

Play modules are great, but they can lead you to develop your projects as if you were still in the same Java world you'd always been in.  For example, when developing a REST API in Play, I started by installing the RestEasy module.  It seemed like a no-brainer – an easy way to implement JAX-RS services.  However, after a bit of development, it became clear that one can get all the RESTful goodness by simply using Play routes.  There is no reason to deal with increased build times and increased deployment size.

If you need the functionality of a specific module, by all means, include it.  However, spend the extra 5 minutes to see whether the functionality you need is provided in core Play first.

2) Not considering the Scala/Java decision closely enough

This is most relevant if you are working in Play! 1.x.  If you are in Play 2.0, it's easy: use Scala, please.  It can be appealing to use Java for Play 1.x projects; however, I tend to think that is a bad idea.  The Play! development team is clearly moving toward Scala as the primary language, and it seems like a good idea to start with Scala in your 1.2 projects now.  It should make the 1.2-to-2.0 upgrade that much easier when you are ready.  I expect to have another entry for my readers in a few weeks with some tips on the 1.2 upgrade process.

3) Building a War

Any Java developer knows this cycle.  Build an application, package it into a war, deploy the war to an application server.  Play allows you to use this familiar cycle, and it supports lots of popular app servers.  That said, think carefully before you go this route.  The play server and play run commands bring up a standalone server using Netty.  It is fast, simple, and, perhaps most importantly, ensures that you run your application the same way in both development and production.

4) Failing to worry about Template Performance

Play is fast.  The JVM is a very optimized platform, and Play takes advantage of that.  However, Play has its Achilles heel – Groovy template rendering.  I know it, you know it, the Play dev team knows it.  Luckily, there are a lot of easy ways to fix this problem if performance is a concern in your app.

  1. Try one of the many alternate templating engines.  FasterGT, Japid, or Rythm all provide much better speed.
  2. Try Play! 2.0.  The template system has been moved to Scala, which shouldn’t have any of the performance issues.
  3. Go Client-Side.  Play! makes a great REST server, and if you combine it with a client-side MVC framework like jmvc or backbone, you can avoid the overhead of server-side template compilation.

5) Using Getters & Setters

Play! provides a wonderful piece of functionality called properties simulation.  I won't rehash the details here – in short, Play! allows us to get away from one of Java's most annoying flaws.  Languages we've come to love, such as Ruby, Python, and JavaScript, all make do without getters & setters.  Try as I might, I can't think of any compelling reason to use them for every little property in our Java code.  I know it can be a little uncomfortable at first to buck the Java conventions and make your members public.  Trust me, once you get used to seeing those concise, clean models, you'll love it.

There you have it, 5 Play! framework anti-patterns.  Please comment if you have any anti-patterns of your own to share.


Java YAML Shootout – SnakeYaml vs YamlBeans

After spending a couple of months developing LiveOn using the Play! Framework, I've grown increasingly intolerant of other Java frameworks.  While I'd used YAML before in Rails and Python, Java frameworks usually ignore YAML in favor of XML for configuration.  The creators of Play! realized that XML sucked, and implemented their dependency, database, and routing configurations as YAML.

So, in the spirit of helping spread YAML around the Java world, I've taken a look at a few YAML libraries.  Let's see which one stacks up as the best bet for adding YAML support to your Java project.


The contenders

Note: I've omitted JYaml.  While it was an early implementation, and is still a useful library, it is no longer maintained by its creator, and is therefore not a realistic candidate.

The full code sample can be found here:  https://github.com/adamnengland/yaml-shootout

Basic Usage  – For simple usage, it's pretty much a wash.  Examples in both libraries look almost identical.

SnakeYaml

@Test
public void testSimpleConfigurationLoad() throws Exception {
  InputStream input = new FileInputStream(new File("src/test/resources/simpleConfig.yml"));
  Map data = (Map) new Yaml().load(input);
  assertEquals("yaml-shootout", data.get("simple"));
}

YamlBeans

@Test
public void testSimpleConfigurationLoad() throws Exception {
  YamlReader reader = new YamlReader(new FileReader("src/test/resources/simpleConfig.yml"));
  Map data = (Map) reader.read();
  assertEquals("yaml-shootout", data.get("simple"));
}

Parsing to a Bean – Once again, not much difference here.

SnakeYaml

  @Test
  public void testAdvancedConfigurationLoad() throws Exception {
    InputStream input = new FileInputStream(new File("src/test/resources/advancedConfig.yml"));
    Yaml yaml = new Yaml(new Constructor(YamlConfig.class));
    YamlConfig data = (YamlConfig) yaml.load(input);
    assertEquals("yaml-shootout", data.simple);
    assertEquals("gmail", data.advanced.type);
  }

YamlBeans

  @Test
  public void testAdvancedConfigurationLoad() throws Exception {
    YamlReader reader = new YamlReader(new FileReader("src/test/resources/advancedConfig.yml"));
    YamlConfig data = (YamlConfig) reader.read();
    assertEquals("yaml-shootout", data.simple);
    assertEquals("gmail", data.advanced.type);
  }

Extra Features

  • Both Libraries support serialization (with a similar, simple syntax)
  • Both Libraries support deserialization into maps, lists, and Strings
  • Both Libraries support Tags & Shortcuts
  • SnakeYaml supports YAML’s merge specification.

Community

Here, SnakeYaml starts to get the edge.  Both SnakeYaml and YamlBeans are hosted on Google Code.

                       SnakeYaml   YamlBeans
Starred by             118 users   24 users
Committers             3           1
Google Group messages  630         57

Finally

Well, I have to say, I'm a little disappointed in the outcome. I'd hoped for something a little more controversial when I started this experiment, but it appears that SnakeYaml and YamlBeans are both excellent YAML frameworks for Java, with a lot of the same syntax. At the end of the day, though, SnakeYaml gets the edge for a larger feature set and a much more active community.
