Qi Qiao Ban: Classpath Scan

It seems to be common sense among many developers that "expensive" things which happen only once are not that bad, and that when those things happen at the wrong point in time, it is a good idea to move them to the "application startup".
So most of us are not very surprised, and consider it no problem, if this startup takes some time.

Welcome to the Cloud

At first sight, deploying my applications in the cloud is exactly the same as it used to be. I follow accepted standards, use common frameworks and respect many best practices. So any runtime platform supporting this should do the job. Cloud in that case means that many more aspects of deployment and operations get automated, including the starting and stopping of additional instances, load balancing and so on.
But wait... When does the platform get the idea to start new instances?

Let the Customer wait?

It's just load. And load means that customers have issued HTTP requests and are awaiting responses in time. From research we know that "in time" means something around 3s. But when a user request leads to the start of a new instance to handle the load, it is far too late to do all the expensive stuff. Application startup is not the right point in time anymore.
Additionally, you learn that all those nice precomputations simply re-calculate values that yesterday's instances, or the other instance on the other machine, already learned. Of course it is still a good idea to have all those values at hand, meaning to "cache" them. But very many of the values which are not coded or configured into the application deployment stay the same as long as the deployment environment doesn't change. Or they stay the same unless the application itself changes them, in which case it is able to tell when the derived values in the caches really need to be re-calculated.

Examples for all this are:

Database-derived value caches, like query results with calculations based on them, where only this application writes to the database. Those values might be needed at startup but stay constant until the application itself changes the database contents.
Classpath scanning to auto-discover software components. The developers and deployers won't want to collect the lists manually or at build time, but the components don't change after the deployment, and definitely not on every startup.
Web templates and other code which needs to be fetched and prepared for use (e.g. compiled). This code gets prepared on every startup of the application although it changes infrequently, and especially not on application startup. Obviously this adds a large amount to the response time, since the code needs to be executed to generate the response.
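
The first example, derived values that stay valid until the application itself writes to the database, can be sketched roughly like this. It is only a minimal illustration: the class DerivedValueCache and its methods are hypothetical names made up for this sketch and are not taken from Tangram or any other framework.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical helper: keeps derived values until the application reports a change. */
final class DerivedValueCache {

    private final Map<String, Object> values = new ConcurrentHashMap<>();

    /** Returns the cached value, running the expensive computation only on a miss. */
    @SuppressWarnings("unchecked")
    <T> T get(String key, Supplier<T> computation) {
        return (T) values.computeIfAbsent(key, k -> computation.get());
    }

    /** Drops a single derived value so that the next get() re-calculates it. */
    void invalidate(String key) {
        values.remove(key);
    }

    /**
     * Runs a write (assumed to be the only way the data changes) and afterwards
     * invalidates every derived value that depends on the written data.
     */
    void writeAndInvalidate(Runnable databaseWrite, String... affectedKeys) {
        databaseWrite.run();
        for (String key : affectedKeys) {
            invalidate(key);
        }
    }
}
```

The point of the design is that invalidation is tied to the write path, so no timeouts or polling are needed as long as the assumption holds that only this application changes the data.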

Just use a Cache

What? Oh, well, not that cache. This cache. This cache doesn't need to be as fast as the in-memory caches already available, and I didn't want to re-invent the wheel. The instances just need to have at hand some values which another instance already calculated, and these have to be updated when the values change, which is, as already pointed out, less frequent than for other runtime values.
The values in the cache are still volatile and can be re-calculated by the application at any time. But they should be persisted for some time, so that they are available at startup and reduce the time for the first request in web applications.
The JSR 107 cache implementation in the Google App Engine is an example of exactly this approach. I wrapped this cache as a PersistentRestartCache in Tangram and cut response times for the first request to a third (or to half in the worst case), from around 30s to 11s. For all other platforms available for the Tangram dynamic web application framework I presented a simple (maybe too simple) file-based implementation which does most of the job as well.
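
To make the idea of a simple file-based variant more concrete, here is a minimal sketch. It is not the Tangram PersistentRestartCache; the class name FileRestartCache, its methods and the use of Java serialization are assumptions chosen only for illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical file-backed cache whose values survive a restart of the instance. */
final class FileRestartCache {

    private final Path file;
    private final Map<String, Serializable> values;

    FileRestartCache(Path file) {
        this.file = file;
        this.values = load(file);
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Serializable> load(Path file) {
        if (!Files.isReadable(file)) {
            return new HashMap<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, Serializable>) in.readObject();
        } catch (IOException | ClassNotFoundException | ClassCastException e) {
            // The values are volatile by contract: on any problem start empty and re-calculate.
            return new HashMap<>();
        }
    }

    /** Returns the persisted value if present, otherwise computes and persists it. */
    @SuppressWarnings("unchecked")
    synchronized <T extends Serializable> T get(String key, Supplier<T> computation) {
        Serializable cached = values.get(key);
        if (cached != null) {
            return (T) cached;
        }
        T fresh = computation.get();
        values.put(key, fresh);
        store();
        return fresh;
    }

    /** Removes a value the application knows to be outdated. */
    synchronized void invalidate(String key) {
        if (values.remove(key) != null) {
            store();
        }
    }

    private void store() {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(values);
        } catch (IOException e) {
            // Persistence is only an optimization; a failed write costs a later re-calculation.
        }
    }
}
```

A second instance created with the same file path stands in for a restarted application: it finds the value on disk and skips the computation. The sketch rewrites the whole file on every change and does not coordinate between several instances writing to the same file, which a real implementation would have to deal with.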
Still, this leaves out the classpath scanning of the Spring Framework or dinistiq. In my environment the Spring Framework takes between 3s and 7s, and dinistiq around 4s, to do the basic application setup based on classpath scanning and additional configuration files. So half of the startup time is spent in framework code and not in the application startup itself. And even this value was only achieved by a nearly brutal reduction of the portion of the classpath to be scanned: I created a "components" package where all auto-scanned components reside, for both the Spring Framework and dinistiq.
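
Why a dedicated "components" package helps can be shown with a deliberately naive scanner. This is not how the Spring Framework or dinistiq scan the classpath; the class NarrowClasspathScan is a hypothetical sketch that assumes classes lie in a plain directory (no JAR files) and only collects class names below one base package.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical scanner that only walks the directory tree of one base package. */
final class NarrowClasspathScan {

    /** Lists top-level class names below basePackage in a directory of compiled classes. */
    static List<String> classNames(Path classesRoot, String basePackage) {
        Path start = classesRoot.resolve(basePackage.replace('.', '/'));
        if (!Files.isDirectory(start)) {
            return Collections.emptyList();
        }
        // Only the subtree of the base package is visited, not the whole classpath root.
        try (Stream<Path> paths = Files.walk(start)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".class"))
                    // Skip nested and anonymous classes such as Outer$1.class.
                    .filter(p -> !p.getFileName().toString().contains("$"))
                    .map(p -> toClassName(classesRoot.relativize(p)))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Turns com/example/Foo.class into com.example.Foo independent of the path separator. */
    private static String toClassName(Path relative) {
        StringBuilder name = new StringBuilder();
        for (Path part : relative) {
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(part.toString());
        }
        return name.substring(0, name.length() - ".class".length());
    }
}
```

The work grows with the number of files below the starting directory, so a small package containing only the components keeps the scan short. The resulting list of names is also exactly the kind of value that does not change after deployment.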

Summary

Caches are a good idea. Applications with a dynamic deployment are a good idea. Thinking of the application startup as a point in time where long calculations may take place is not such a good idea (at least not anymore) where instances should be brought up automatically depending on load and not on human decisions.
So simply bring together caches with persistence and the knowledge of when those caches can be invalidated. As an example, I did this for the Tangram web application framework and had quite some success on the Google App Engine, run@cloudbees, and OpenShift cloud platforms for the Java world.
The last but also important point: Optimizing the application startup in general is worth the time nowadays.

Along with the missing support for the Servlet 3.0 specification in Google App Engine (and, yes, there are still too many situations where this specification level is not available), we read from Google that classpath scanning is an issue on their App Engine platform and that this is one of the reasons that kept them from supporting this version of the servlet specification.
What I didn't read from Google is that they noticed that Servlet 3.0 is (arguably) one of the most important steps since Java came to the web. There is no major framework in the Java world anymore that gets by without classpath scanning. The Spring Framework, which I'm using, is just one example.
After some five years of work with the Google App Engine and Java as the only language in use, I changed my mind and no longer consider the App Engine one of the first choices among cloud platforms for the Java world.
And this is really just because of these two small issues.
Like very many others, I'm using the Spring Framework with component scan, and thus with classpath scanning. Any first request to an instance takes ages. But "first requests" are common in the cloud, where instances need to be shut down and brought up depending on load. And the cloud is what Google App Engine is about, isn't it?
I invested quite some effort to learn how to be fast on the first request while still using the Spring Framework, and even developed my own stripped-down, minimalistic dependency injection environment for the application setup (dinistiq) to only have the features I'm using at hand. But all this didn't help to make the end user experience satisfying. Things feel slow.
So in the end, all this gave the push for Tangram to support that many new platforms, to use the CI features and repositories at cloudbees, and to enjoy the command line access of OpenShift. This brought options for different frameworks, and I learned much about cloud deployment and operation scenarios. So in that respect we should be thankful for the Google App Engine weakness.
But it is still the source of some complexity and of a number of artifacts flying around in what I call my dynamic web application framework Tangram. It also currently renders Google App Engine only the second-best cloud platform for Java, despite its great web-based monitoring tools.

Thursday, 24 July 2014

Fast on the first Request

Tuesday, 15 July 2014

Google App Annoyance - aka Engine