I spent this Monday at the CommunityOne event prior to JavaOne at the Moscone Center in San Francisco.  A few thoughts on the event:

  • Keynotes are so passé: if you’re not going to be announcing something fantastic, you’re just making too many people sit too close together for no reason.  And, for gawd’s sake, make sure your demos are going to work.
  • Lightning presentations are great for sparking new ideas: get the ideas out there as fast as you can, let other people think of ways to use them if they want to.

A few of the interesting technology ideas I’ll be playing with after seeing them today (go, go HackDays):

  1. JavaDB+Comet: in the browser through Java plugin extensions, and in other view clients
  2. WebSynergy
  3. Comet + Scala

So, I was quite surprised to see the following listed in my System Updates for my MacBook Pro today:

java version "1.6.0_05"
Java(TM) SE Runtime Environment (build 1.6.0_05-b13-120)

Apparently I’m not the only one to have seen this.
Interesting that, since I don’t exactly live under a rock, I didn’t see any announcements for this. But it’s great to have nonetheless.

Since I’m going to JavaOne next week, this was quite fortunate. All of the labs I’ve signed up for require Java 6, and since I hadn’t checked the developer preview in a while, I was wondering about its JavaFX support.

It auto-updated the WebStart client, letting me run the JavaFX WebStart demos from java.net for the first time.

And, installing this update + NetBeans 6.0.1 + JavaFX Plugin lets the great JavaFX demos and tutorials for NetBeans run out of the box!!!

So, I wrote a little while back on some great ways to kill performance when using Hibernate. Obviously the best (as in most successful, not most recommended) way to come to these conclusions is to experience the pain from making these mistakes.

Being the good little refactoring shop that we are, we took it as a challenge to clean this up and see how things improved. First a little background on a particularly problematic area of the system:

  • Basic Hibernate DAO, ensures that all needed nodes in an object graph are initialized before sending the graph to business and presentation components.
  • The graph in question is made up of 4 object classes:

Batch Cardinalities

Average cardinalities for each relationship are noted below the role names in the diagram.

  • Relationships ‘j’, ‘k’, and ‘l’ are managed using property-refs/unique keys instead of primary key relationships.
  • Relationship ‘m’ is mapped as an ‘any’ relationship.

Because of Hibernate’s inability to batch or proxy objects using these strategies, and because each cardinality compounds on the previous ones, this model results in the following actions to load this data:

  1. run the original query for A’s
  2. for each A, query for all related B’s
  3. for each B, query for all related D’s
  4. for each B, query for all related C’s
  5. for each C, query for all related A’s

This results in 1(j)+100(m)+100(k)+1000(l)=1201 individual queries for each root object retrieved!!! If I end up with 100 records coming back in my initial search, that’s 120,100 queries even if I only wanted the A’s returned from the initial query!!!

Convert these relationships to regular primary key relationships (through simple DML updates for the property-refs and a little ORM hierarchy for the “any”), set some reasonable batch fetch sizes (let’s assume 100 for this example) and the flow changes to:

  1. run the original query for A’s
  2. for each 100 A’s, query for all related B’s
  3. for each 100 B’s, query for all related D’s
  4. for each 100 B’s, query for all related C’s
  5. for each 100 C’s, query for all related A’s

1(original search for A)+1(j)+1(m)+1(k)+10(l)=14 queries to get the entire object graph for a single root A, or 1 query if all I care about is the root nodes.

Conclusion

A combination of proxying and batch fetching changed the load characteristics dramatically:

Task | Queries without batching and proxies | Queries with batching and proxies
Load A’s | 120,100 | 1
Load graph of 4 Objects | 120,100 | 14

For our system, making this change brought the load time for a single view component down from the 10’s of seconds to hundredths of a second.
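The query counts above can be reproduced with a little arithmetic. The sketch below is only an illustration: the class and method names are made up, and the cardinalities (100 B’s per A, 10 C’s per B) are inferred from the counts quoted in this post rather than read off the diagram.

```java
public class QueryCount {

    /** Number of round trips needed to load `items` owners, `batchSize` at a time. */
    static long batches(long items, long batchSize) {
        return (items + batchSize - 1) / batchSize; // ceiling division
    }

    /** Follow-up queries (excluding the root query) when nothing can be batched. */
    static long unbatched(long roots, long bPerA, long cPerB) {
        long bs = roots * bPerA; // B's reached through relationship j
        long cs = bs * cPerB;    // C's reached through relationship k
        // j: one query per A; m and k: one query per B each; l: one query per C
        return roots + bs + bs + cs;
    }

    /** Total queries (including the root query) with a batch fetch size. */
    static long batched(long roots, long bPerA, long cPerB, long batchSize) {
        long bs = roots * bPerA;
        long cs = bs * cPerB;
        return 1                               // original search for A
                + batches(roots, batchSize)    // j
                + 2 * batches(bs, batchSize)   // m and k
                + batches(cs, batchSize);      // l
    }

    public static void main(String[] args) {
        System.out.println("unbatched, 1 root:    " + unbatched(1, 100, 10));
        System.out.println("unbatched, 100 roots: " + unbatched(100, 100, 10));
        System.out.println("batched, 1 root:      " + batched(1, 100, 10, 100));
        System.out.println("batched, 100 roots:   " + batched(100, 100, 10, 100));
    }
}
```

Under these assumed cardinalities the 14-query figure is the single-root case; the batched count still grows with the number of roots, just roughly a hundred times more slowly than the unbatched one.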

I’ll start off by saying, I think proxies are beautiful things. Spring’s dynamic proxies are extremely useful for AOP style tasks, things like dynamic transaction management. They’re also used very effectively in Hibernate for performance optimizations such as graph clipping (making sure that you get only the data you actually want back from the database instead of loading the whole system at once).

Both Spring and Hibernate use CGLIB for their more difficult proxying (class proxying vs the arguably simpler interface proxying). If you’re using either of these frameworks and their proxying capabilities, here are a couple of things that might save you some pain and anguish.

  1. Provide a public, no-argument constructor for any objects you might want to create a proxy for. Although CGLIB can create proxies for classes that take constructor arguments, it requires a lot more work. Private constructors are a different matter. When creating a proxy for a class, CGLIB dynamically generates a proxy class that is a subclass of the class you are proxying. That means that if the parent’s constructor is private you’re SOL.
  2. Beware of final methods. A core language construct, final implies that no subclass can override the behaviour of the parent class. Because CGLIB generates a new subclass of the class to be proxied and adds advice to it, it can’t proxy method calls on a final method. This means they will fall through to the base class and possibly give you unexpected behaviour if you expect the proxy to intercept all methods. Consider the following contrived example:
    public class AbstractParent {
        private final String name;
        // No-argument constructor: this is the one a generated proxy subclass calls
        public AbstractParent() {
            this.name = "parent";
        }
        public AbstractParent(String name) {
            this.name = name;
        }
        // final: a subclass (and therefore a CGLIB proxy) cannot override this
        public final String getName() {
            return this.name;
        }
    }
    public class ChildOne extends AbstractParent {
        public ChildOne() {
            super("childone");
        }
    }


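    Now if you create a proxy of type AbstractParent that actually delegates to an instance of ChildOne, calls to proxyForChildOne.getName() will always return “parent”: a final method cannot be intercepted, so it runs against the proxy’s own state (set by the no-argument constructor) rather than against the ChildOne it delegates to.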

  3. Hibernate Inheritance Strategies generate proxies for the common ancestor for a relationship. This one might seem obscure, but it’s important. If you use Hibernate to map inheritance strategies, any relationships to the parent object will proxy an instance of the parent, not of the concrete child. Consider the following set of objects:
    Simple Inheritance Diagram

    The relationship from RootObject must go to one of either ChildOne or ChildTwo, but if the relationship in question is set as “lazy” (lazy=”proxy” in the case of a One-to-One or Many-to-One relationship), Hibernate can’t know what the concrete implementation will be until it initializes the object, so its type will be a subclass of AbstractParent. When the relationship is initialized, the Proxy will still remain, but delegate to the concrete object that it proxies for. The key phrasing being “the Proxy will still remain”.

    Simple Inheritance Object Model

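    Now, any calls to
    ChildOne.class.isInstance(RootInstance.getRelatedObject())
    will return false. One way to remedy this is to ask Hibernate for the underlying class and compare classes instead. Hibernate.getClass() returns a Class, not an instance, so the check has to use isAssignableFrom rather than isInstance:

    ChildOne.class.isAssignableFrom(Hibernate.getClass(RootInstance.getRelatedObject()))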
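Points 2 and 3 above can be seen without any framework at all. The following is a minimal sketch that uses a hand-written subclass as a stand-in for the class CGLIB would generate; ParentProxy, describe() and getTarget() are hypothetical names added for illustration, and real proxies are generated at runtime rather than written by hand.

```java
public class ProxyPitfalls {

    static class AbstractParent {
        private final String name;

        public AbstractParent() {
            this.name = "parent";
        }

        public AbstractParent(String name) {
            this.name = name;
        }

        // final: cannot be overridden, so a subclass proxy cannot intercept it
        public final String getName() {
            return this.name;
        }

        // non-final: a subclass proxy can override and delegate this
        public String describe() {
            return "I am " + this.name;
        }
    }

    static class ChildOne extends AbstractParent {
        public ChildOne() {
            super("childone");
        }
    }

    // Hand-written stand-in for a generated proxy: a subclass of the common
    // ancestor that forwards overridable calls to the real object.
    static class ParentProxy extends AbstractParent {
        private final AbstractParent target;

        public ParentProxy(AbstractParent target) {
            super(); // the proxy's own state comes from the no-argument constructor
            this.target = target;
        }

        @Override
        public String describe() {
            return target.describe();
        }

        public AbstractParent getTarget() {
            return target;
        }
    }

    public static void main(String[] args) {
        ParentProxy proxy = new ParentProxy(new ChildOne());
        System.out.println(proxy.describe());          // delegated to the ChildOne
        System.out.println(proxy.getName());           // final, answered by the proxy itself
        System.out.println(proxy instanceof ChildOne); // the proxy is not a ChildOne
        System.out.println(ChildOne.class.isAssignableFrom(proxy.getTarget().getClass()));
    }
}
```

The overridable method reaches the ChildOne, the final method reports the proxy’s own “parent” state, and the type check only succeeds once you look at the class of the object behind the proxy.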

A lot that’s blog worthy has been going on lately with our entire development team going to NFJS this past weekend in Seattle as well as an upcoming hack day next week. I promise to write on that later. To bribe you for your continued readership, I provide the following “Face Hack”, a more physical representation of what Hack Day means to me.
Spaghetti Hack
More available here

So, I guess I’m a little slow, but Google Desktop is now available on Linux.
IMO, this goes a long way to making desktop Linux usable to the masses.
For those that are just starting with Linux, it provides a fantastic means to find information on your system: a little ctrl-ctrl and you’re viewing perfectly formatted man files, browsing through OpenOffice documents or, by adding your common PATH entries to your search path, firing off applications without adding a launcher anywhere.
Beautiful…

Jul 04

So, we’re on to another hack day at my company. No, I don’t work at Yahoo, we just borrowed the idea. I’m really excited about some of the prizes this year, care of everyone’s favourite, thinkgeek. What prize am I hoping for? Probably the shocker tanks. Eat your heart out, mike.

So, we’ve been using Hibernate in our product development for a little over a year now, and I thought I’d share a couple of pearls on how to scuttle a Hibernate project. Fortunately, our team has gotten better at adapting quickly and knew when to cut and run on each of these issues.
So, if you are looking for ways to make your Hibernate implementation go poorly, here goes:
Depend heavily on filters
Filters seem like a really cool idea, don’t they? Being able to inject limitations into your queries on the fly to limit the views people get. But wait, it doesn’t seem to apply the same way to all queries, you say? Oh ya, little bug there, but don’t worry, we solved it by removing the functionality.
Make every relationship an ‘any’ relationship
Sure, any relationship in your system CAN be an ‘any’ relationship (meaning it can join to any table instead of a specific referenced table). Who cares about foreign keys, who needs referential integrity? For those of you that love making yourselves indispensable, this is a great way to ensure your data model is understood only by yourself and you will always be employed dealing with orphaned data and cleanups.
Use lots of ‘property-refs’
Hibernate doesn’t deal with property-refs nearly as well as it does regular PKs. It has problems creating proxies for them, it can’t batch load them… Meaning, if your goal is to dramatically decrease your performance, property-refs to the rescue!
Use lots of HQL WHERE clause joins
Normally, Hibernate requires that any relationships you want to join on be mapped as Hibernate relationships. You can get around this by joining in where clauses (like SQL). Unfortunately, this often makes it much harder to tune your queries. So if your goal is to… you know the rest.

So I work on what can only be described as an Enterprisey system. Some of us, like my good friend over here, have been making an effort of late to make the system cleaner, faster and more maintainable.
Previous attempts at performance improvements in house have spawned a series of Hash[Map|Set]s. Sets of updated objects in change logs, Maps of keyed singletons for fast lookups etc.
We’ve had a couple of lessons forced upon us lately in this space that fall into 2 major categories:
The classic Enterprise HashMap:

  • If you’re going to key singletons, make sure they really are singletons. Are you ever going to be running part of the system in more than one VM? Can other components update your data?
  • Keep a common caching location and paradigm. 32 different caches implemented different ways and living in different parts of the codebase is very difficult to debug and test.
  • If you could end up caching a lot of entities, for the love of god make sure the contents can be garbage collected (both by using a WeakHashMap AND by making sure your clients don’t expect that things they’ve put in the Cache will always be there)
  • Any static HashMap in a class should be considered a cache of its own. Think about how this is going to be refreshed before peppering it throughout your code.
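To make the garbage collection point from the list above concrete, here is a minimal sketch of a cache whose clients never assume an entry is still there. WeakCache and its loader function are hypothetical names, not something from our codebase, and a real cache would also need a refresh policy.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

public class WeakCache<K, V> {

    // WeakHashMap holds its KEYS weakly: once nothing else references a key,
    // the entry may vanish. Values are held strongly, so a value that refers
    // back to its own key keeps the entry alive indefinitely.
    private final Map<K, V> entries = Collections.synchronizedMap(new WeakHashMap<K, V>());

    // Knows how to rebuild a value when the cache no longer has it.
    private final Function<K, V> loader;

    public WeakCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if it was never cached or has been dropped. */
    public V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            // A miss is normal, not an error. Two threads may both load here;
            // that is acceptable as long as loading is repeatable.
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    /** Explicit refresh hook for when another component updates the data. */
    public void invalidate(K key) {
        entries.remove(key);
    }
}
```

The design choice that matters is that get() is the only way in: callers cannot tell a collected entry from one that was never cached, so they cannot come to depend on things staying put.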

We’ve also been burned a couple of times on Hashing in the more conventional sense which has taught us the following about high cardinality HashSets (and HashMaps for that matter, since the implementation is identical in Java):

  • Don’t neglect your .hashCode(). If your policy is to return a constant as the fall-through case on a .hashCode(), expect serious performance problems as the number of those elements increases in the Set. They will all end up in the same hash bucket, meaning every subsequent addition will have to compare against every previously added entity. Which brings us to:
  • Don’t neglect your .equals(). If you end up with a lot of elements in your Set, possibly because of the above, or any other reason, you better be prepared to have your objects compare against each other. Remember, the whole point of a Set over the much simpler (and usually much faster) List, is to ensure that only one instance of something exists in the Set. What does that mean? Right… .equals()
  • Use the best method available for the task you’re doing. That means avoiding .clear() and .addAll() on HashSet where you can. Doing a .addAll() means that everything you are adding needs to be compared against everything you already have AND everything else you are adding. We got a 50% performance improvement by replacing an .addAll() with a X = CollectionUtils.subtract(A,B); .addAll(X)! If you don’t know what method you should be using, find yourself a profiler.
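The cost of a constant .hashCode() is easy to measure by counting how often .equals() gets called while filling a HashSet. This is a rough sketch with made-up names (HashCodeDemo, Key), not code from our system; the exact counts depend on the JVM’s HashMap implementation, so only the comparison between the two runs is meaningful.

```java
import java.util.HashSet;
import java.util.Set;

public class HashCodeDemo {

    // Incremented on every Key.equals() call (single-threaded demo only).
    static int equalsCalls = 0;

    static final class Key {
        private final int id;
        private final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            // The "fall-through to a constant" policy vs. a hash derived from the id
            return constantHash ? 42 : Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Key && ((Key) other).id == id;
        }
    }

    /** Adds n distinct keys to a HashSet and reports how many equals() calls that took. */
    static int countEqualsCalls(int n, boolean constantHash) {
        equalsCalls = 0;
        Set<Key> set = new HashSet<Key>();
        for (int i = 0; i < n; i++) {
            set.add(new Key(i, constantHash));
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        int n = 2000;
        System.out.println("distinct hashCodes: " + countEqualsCalls(n, false) + " equals() calls");
        System.out.println("constant hashCode:  " + countEqualsCalls(n, true) + " equals() calls");
    }
}
```

With distinct hash codes the set can usually tell elements apart from the hash alone; with a constant hash every insert has to fall back on .equals() against elements already in the bucket.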

Apr 02

So, there are a few of us at work these days that have started looking at dead code in our system. We run into all the usual culprits: functionality only half removed, code only exposed for the sake of unit tests, complexity added for the sake of “Patterns” or just ’cause it seemed cool.
I’ve seen some pretty fantastic YAGNIs in the last couple of weeks. Some of them I can honestly say I added, or was at least involved in the creation of. So when I saw this today, it really made me laugh. A RemoteObjectProvider used only by unit tests has NOTHING on elevator equipment in the off chance you want to add a door…
The lift to nowhere