Friday, January 3, 2014

Sleeping for a negative interval in perl

Stumbled on this behavior accidentally when walking through some buggy Perl code. The gist is:
> perl -e 'sleep(1);print "done\n"'
done
> perl -e 'sleep(-1);print "done\n"'
... never completes!
I tried to trawl through the perldoc and the underlying Linux man page for sleep but couldn't find an explicit explanation. My guess is that because the underlying sleep function in unistd.h accepts an unsigned int, Perl's -1 gets reinterpreted as a 32-bit unsigned integer, i.e. 2^32 - 1 = 4294967295 - and waiting that many seconds is a loong time, making it look like the program never completes. To put this theory to the test, we tried out something like:
> perl -e 'sleep(-0.0000000000000000000000000000001);print "done\n"'
done
and it completed! Hope this helps someone else as well!
EDIT:
Looks like the fractional example is not a good one - looking through Perl's source code, it seems that sleep is simply a macro:
#define PerlProc_sleep(t)    sleep((t))
So the floating point argument was simply being truncated to zero. A better test case is 2^32 - which can't fit in a 32-bit unsigned integer and so wraps around to 0, with 2^32 + 1 wrapping to 1, 2^32 + 2 to 2, and so on.
> date ; perl -e 'sleep(4294967296);print "done\n"' ; date
Mon Jan  6 02:12:27 EST 2014
done
Mon Jan  6 02:12:27 EST 2014
> date ; perl -e 'sleep(4294967297);print "done\n"' ; date
Mon Jan  6 02:12:34 EST 2014
done
Mon Jan  6 02:12:35 EST 2014
> date ; perl -e 'sleep(4294967298);print "done\n"' ; date
Mon Jan  6 02:12:42 EST 2014
done
Mon Jan  6 02:12:44 EST 2014
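For the arithmetically inclined, the wraparound is just truncation to the low 32 bits, i.e. arithmetic modulo 2^32. A quick illustration in Java (this is of course not what Perl does internally; it merely shows the same truncation the timestamps above confirm):

public class UnsignedWrapDemo {
    public static void main(String[] args) {
        // -1 viewed as an unsigned 32-bit value: 2^32 - 1 = 4294967295
        System.out.println(-1L & 0xFFFFFFFFL);          // 4294967295
        // Larger arguments are effectively taken modulo 2^32
        System.out.println(4294967296L & 0xFFFFFFFFL);  // 0 -> returns immediately
        System.out.println(4294967297L & 0xFFFFFFFFL);  // 1 -> sleeps ~1 second
        System.out.println(4294967298L & 0xFFFFFFFFL);  // 2 -> sleeps ~2 seconds
    }
}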
EDIT 2:

A colleague of mine pointed out a couple of things. A more straightforward way of confirming the hypothesis is via strace. First, we need to know which underlying Linux system call is being invoked by the Perl runtime: strace -v on the perl one-liner with sleep(1) identifies it as nanosleep. strace -v -e nanosleep on the one-liner with sleep(-1) then shows that it does indeed try to sleep for 4294967295 seconds. Second, Perl's Time::HiRes module is a better alternative for such sleeps, because it catches these negative sleep intervals with a cocky "negative time not invented yet" croak.

Thursday, October 10, 2013

Simple caching with a TTL

Spring's cache abstraction framework is hugely useful for declarative caching, as outlined in previous posts. As we start to use more caches, an inherent requirement that arises is periodic and on-demand purging. For the first kind of purge, you need a Time To Live (TTL) associated with your cache. External solutions like ehcache provide configuration to do this, along with a host of other configurable parameters - writing to disk, disk location, buffer sizes, max limits and so on. But what if your requirement is simpler and you don't want to marry into ehcache just yet?
Spring's ConcurrentMapCacheFactoryBean has been made nicely pluggable: you can plug in any backing store you want for the concurrent map based caching. So here we can plug in our own TTLAwareConcurrentMap. But I don't want to write the TTL logic myself, right? Sure - just use the constructs available in Guava. Guava's CacheBuilder gives you a TTL-backed map, and it looks as simple as:
return CacheBuilder.newBuilder().expireAfterWrite(duration, unit).build().asMap();
All we need to do now is create a FactoryBean in Spring that is injected with the duration (and unit of the duration) of the TTL and vends out the store to be used by Spring's caching framework. A sample FactoryBean is TTLAwareConcurrentMapFactoryBean.
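A minimal sketch of what such a FactoryBean might look like (the real sample may differ in naming and details; the duration/unit properties here are assumptions):

import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

import org.springframework.beans.factory.FactoryBean;

import com.google.common.cache.CacheBuilder;

// Sketch only - the actual TTLAwareConcurrentMapFactoryBean may differ.
public class TTLAwareConcurrentMapFactoryBean implements FactoryBean<ConcurrentMap<Object, Object>> {

    private long duration;
    private TimeUnit unit;

    public void setDuration(long duration) {
        this.duration = duration;
    }

    public void setUnit(TimeUnit unit) {
        this.unit = unit;
    }

    @Override
    public ConcurrentMap<Object, Object> getObject() {
        // Guava evicts entries lazily once they are older than the configured TTL
        return CacheBuilder.newBuilder().expireAfterWrite(duration, unit).build().asMap();
    }

    @Override
    public Class<?> getObjectType() {
        return ConcurrentMap.class;
    }

    @Override
    public boolean isSingleton() {
        // each cache should get its own backing map
        return false;
    }
}

The map it vends can then be handed to Spring's ConcurrentMapCacheFactoryBean as its backing store.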
For on-demand Spring cache flushes, we can define a JMX operation on a custom CacheManager that is injected with the Spring caches at startup. On invocation, the specific cache (or all caches) can be flushed by calling the Cache.clear() method. Due credit to this SO question.
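One way to sketch this - here using a small JMX-exposed helper that wraps the injected CacheManager rather than a custom CacheManager implementation; the class name and JMX object name are illustrative only:

import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.jmx.export.annotation.ManagedOperation;
import org.springframework.jmx.export.annotation.ManagedResource;

// Illustrative only - names are made up.
@ManagedResource(objectName = "com.kilo:name=CacheFlusher")
public class CacheFlusher {

    private final CacheManager cacheManager;

    public CacheFlusher(CacheManager cacheManager) {
        this.cacheManager = cacheManager;
    }

    // Flush a single named cache on demand via JMX
    @ManagedOperation
    public void flushCache(String cacheName) {
        Cache cache = cacheManager.getCache(cacheName);
        if (cache != null) {
            cache.clear();
        }
    }

    // Flush every cache known to the CacheManager
    @ManagedOperation
    public void flushAllCaches() {
        for (String name : cacheManager.getCacheNames()) {
            cacheManager.getCache(name).clear();
        }
    }
}

Registering this bean with an MBeanExporter (for example via <context:mbean-export/>) then exposes the operations in any JMX console.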
Hope this helps!

Wednesday, September 25, 2013

Generating JAX-RS client stubs

In the CXF JAX-WS world, when giving out APIs for clients to use, the steps to follow were:
  • Annotate the service and service methods with the @WebService and @WebMethod annotations in the business module
  • Generate a WSDL corresponding to the given service methods using the java2wsdl utility as part of the Maven build of the business module
  • Define a separate ws-api module that uses the wsdl2java utility to create the client stubs using the WSDL file generated by the business module as an input. The target of this module is what is given to the outside world
How do we do this in a CXF JAX-RS world where, though we have a wadl2java utility, we don't have a java2wadl utility - and where the concept of a WADL is itself largely contentious?
Suppose we have an interface that we wish to expose in a JAX-RS way - say SpecialService.java. This service interface can expose methods that produce complex Java objects, which in turn may have other non-primitive fields. The rs-api jar that we wish to give to the clients of this interface should contain only the relevant objects required and should be spared the implementation classes used by the business module itself (just like the ws-api jar in the JAX-WS world). ProGuard is one utility that helps us out here. ProGuard is better known as a class file obfuscator for distributing Android packages (now being superseded by DexGuard), so that clients are not able to reverse-engineer the original logic. However, it is also a pretty nifty optimizer, in the sense that it can do static code analysis to see which class files are needed corresponding to a given starting point. And that is really what we want: to see which classes are needed given the SpecialService class as the starting point.
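For concreteness, a hypothetical shape for such an interface - the paths, method names and return types are made up for illustration, and SpecialItem stands in for one of those complex domain objects defined elsewhere in the business module:

package com.kilo;

import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Sketch only - the real SpecialService will differ.
@Path("/special")
public interface SpecialService {

    // Returns a complex object graph that the rs-api jar must also carry
    @GET
    @Path("/items/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    SpecialItem getItem(@PathParam("id") long id);

    @GET
    @Path("/items")
    @Produces(MediaType.APPLICATION_JSON)
    List<SpecialItem> getItems();
}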
The configuration is as simple as:
    <configuration>
        <obfuscate>false</obfuscate>
        <injar>../../proguard-tester-business/target/proguard-tester-business-0.0.1-SNAPSHOT.jar</injar>
        <inFilter>!**.xml</inFilter>
        <outjar>${project.build.finalName}.jar</outjar>
        <outputDirectory>${project.build.directory}</outputDirectory>
        <options>
            <option>-dontnote</option>
            <option>-keepattributes</option>
            <option>-keep @javax.ws.rs.Path public interface com.kilo.SpecialService { *;}</option>
            <option>-keepclassmembers class * { *;}</option>
            <!-- Additional classes as needed -->
            <option>-keep public class com.kilo.MixinSetter { *;}</option>
            <option>-keep public class com.kilo.ApplicationParamConverterProvider { *;}</option>
        </options>
    </configuration>
This configuration will create a stub for all the methods in the interface annotated with the @Path annotation and will pull in their dependent classes. Hence, any methods in your interface that are not annotated will not figure in the interface provided to your clients - which is great. But one should also consider why two methods serving such varied needs are part of the same interface in the first place - anyway, that is up to the designers of the interface. Any additional classes needed can also be mentioned, and the rs-api jar is thus self-sufficient and directly usable by the client - with no dependency on a WADL whatsoever. A sample setup is available here.
The ProGuard usage manual has an exhaustive explanation of the different options, and I found the community support for it to be good as well.
The next step would be to see how we can generate a source jar for these client stubs so that debugging becomes easier, but that is for another post. Hope this helps!

Thursday, July 4, 2013

Quick embedded db for integration tests

Our implementation of a bitemporal library had a sample client. The integration test cases in the client project run nightly to affirm the veracity of the actions. To see the client in action, it needed an underlying table on a dataserver. To get the ball rolling, we initially housed it in the sandbox DB of a dev dataserver. However, it regularly went missing, giving us Monday morning pains. Next, we decided to move it to one of the databases that we owned on the dev dataserver. However, the Monday morning pain changed to a monthly pain coinciding with the dataserver refresh cycles, when the tables on the database would go missing again. We could have considered moving these tables to the production dataserver, but that would be undesired clutter to a newcomer. As an alternate route, we just pointed it to a local dataserver instance running on one of our colleagues' machines. And yes, you guessed it - when the machine was down for reboots or other maintenance, there were failures again, though we needed to be especially unlucky for that to happen. Here an embedded database seemed to fit the bill nicely.
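Since the rest of our stack is Spring-based, one minimal way to wire this up is Spring's embedded database support - a sketch, assuming an in-memory H2 database and a hypothetical schema.sql that creates the table the client needs:

import javax.sql.DataSource;

import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

// Sketch only - the database type and script name are assumptions.
public class EmbeddedDbConfig {

    public DataSource embeddedDataSource() {
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.H2)  // in-memory, recreated for every test run
                .addScript("classpath:schema.sql") // DDL for the table the client needs
                .build();
    }
}

With this, the integration tests no longer depend on anyone else's dataserver staying up or keeping our tables around.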

Monday, June 24, 2013

Mixin it up Jackson style

We have already discovered the goodness of Jackson for vending out JSON data in a JAX-RS setup. In many cases, our domain objects have references to other domain objects that are not under our control (e.g. classes from third-party JARs). We can easily add annotations to our own domain objects to indicate which fields to ignore, but what do we do for the third-party objects? One could choose to write custom bean serializers, but that is pretty onerous on the part of the user and requires higher maintenance since it lives away from the original class. This is where Jackson Mixins come to the rescue.
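A minimal, self-contained sketch of the idea - the class and field names are made up, and note that addMixIn is the newer Jackson 2 name (older 2.x releases call it addMixInAnnotations):

import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.databind.ObjectMapper;

public class MixinExample {

    // Stands in for a class from a third-party JAR that we cannot annotate
    public static class ThirdPartyUser {
        public String getName() { return "kilo"; }
        public String getSecret() { return "do-not-serialize"; }
    }

    // Mixin: Jackson applies these annotations to ThirdPartyUser as if they
    // were declared on the class itself
    public abstract static class ThirdPartyUserMixin {
        @JsonIgnore
        public abstract String getSecret();
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        mapper.addMixIn(ThirdPartyUser.class, ThirdPartyUserMixin.class);
        // Prints {"name":"kilo"} - the secret field is suppressed via the mixin
        System.out.println(mapper.writeValueAsString(new ThirdPartyUser()));
    }
}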

Wednesday, June 12, 2013

Know your disk latency numbers

I was looking around for a tool that would help quantify the latency difference between an NFS-mounted location and a local one, and came across ioping. As the documentation indicates, it tries to represent disk latency in pretty much the same way the ping command shows network latency to a desired host.
I had to build it from source for our Linux installation, but the process went off without a hitch. I first tried it on my NFS-mounted home drive, where the checked-out code, Eclipse workspace and Maven repository reside - the hotspot for IO activity during development. Using a sample invocation from the documentation site, I tried it with a chunk size of 1MB, ten times.
$>ioping -c 10 -s 1M
--- /home/kilo(nfs fs1.kilo.com:/vol/home/kilo) ioping statistics ---
10 requests completed in 9101.5 ms, 102 iops, 102.1 mb/s
min/avg/max/mdev = 9.5/9.8/10.0/0.2 ms
Next was to test out the performance of our local partition with the same parameters:
--- /local/kilo (ext4 /dev/ssda) ioping statistics ---
10 requests completed in 9052.4 ms, 201 iops, 200.8 mb/s
min/avg/max/mdev = 4.8/5.0/5.6/0.2 ms
The result: the local disk was roughly twice as fast - about half the latency and double the iops!
For kicks, I tried it out on a /tmp location as well with the same parameters:
--- /tmp (tmpfs none) ioping statistics ---
10 requests completed in 9004.1 ms, 5219 iops, 5219.2 mb/s
min/avg/max/mdev = 0.1/0.2/0.3/0.0 ms
That was roughly 26 times faster than even the local disk - but hold on, what is that tmpfs being mentioned? Digging further, it turns out tmpfs is a special filesystem that doesn't reside on a physical disk but is instead backed by physical memory (and is therefore volatile) - so the speed-up in access is kind of expected. It also means that we should be extra careful about what goes into /tmp in such a setup: keeping a lot of garbage there and not cleaning up eats into memory and will come back to bite us at unpredictable times. Guess many people already know about this.
Coming back to the utility itself, there are options to change the working set size, the offset within the file to seek to, etc. One particularly interesting feature is the option to use write IO, but that option was not available in the 0.6 release that I downloaded. An issue indicates that this is a yet-to-be-released feature, and it will be interesting to see it in action.
I think this will be a good utility to have in our Linux installations by default. If there are other utilities that do similar stuff but come out of the box with a stock RHEL installation, please let me know. Hope this helps!

Monday, May 27, 2013

Lets go uber!

How do you run your non-webapp, Maven-based Java program from the command line? One might use the exec:java route and specify the main class. The only sticking point here is that the classpath will be filled with references to the user’s local repository (usually NFS mounted). I would be much more comfortable if I had the whole set of jars that I depend on packaged WAR-style and ready at my disposal. In that sense, the application would be much more self-contained. It also becomes that much easier for someone to test-drive your application. Hence the concept of an “uber” jar - a unified jar that houses all classes/resources of the project along with the classes/resources of its transitive dependencies.