Tuesday, September 30, 2008

Cloud Computing is a Trap - RMS

In this recent article, Richard M. Stallman expresses his view on cloud computing. According to him, it is nothing but a fashion statement. RMS advises users to stay local and stick with their own infrastructure.

My personal view on cloud computing is that it is more or less a business decision. Getting someone else to provide you with quality infrastructure is like taking a cab instead of driving yourself. In the business world, it's all about getting rid of overheads. And of course, at the end of the day it is all about money!

Sunday, September 28, 2008

Zimbra Desktop Client Exposes Authentication Information in Plain Text?

I happened to read this recent post on Holden's blog. According to him, the Yahoo! Zimbra desktop client exposes username and password information in plain text. He discovered this flaw during a Yahoo! 'hacku' day at the University of Waterloo. The following image, found on the same blog, shows Zimbra sending authentication information in plain text, as observed in Wireshark.

[Image: Wireshark capture showing Zimbra sending credentials in plain text]

Saturday, September 27, 2008

I Still See this as an Erratum in WS-Coordination Specification

I have been studying the WS-Coordination specification for some time now, as I am supposed to implement it in the near future. In truth, it is a fairly simple specification. But there is one thing I am struggling to understand. To be honest, I see it as an erratum.

[Diagram: WS-Coordination interaction overview, from page 11 of the specification]

This diagram on page 11 of the specification shows the behavior of WS-Coordination. In other words, this really is the big picture.

I am totally OK with steps 1 through 3. But according to the explanation on page 10, steps 4 and 5 are about building links between Yb/App2 and Ya/Yb respectively. Page 10 says,

4. App2 determines the coordination protocols supported by the coordination type Q and then  Registers for a coordination protocol Y at CoordinatorB, exchanging Endpoint References for App2 and the protocol service Yb. This forms a logical connection between these Endpoint References that the protocol Y can use.

5. This registration causes CoordinatorB to decide to immediately forward the registration onto CoordinatorA's Registration service RSa, exchanging Endpoint References for Yb and the protocol service Ya. This forms a logical connection between these Endpoint References that the protocol Y can use.

So at the end of the day, Ya talks to Yb, which in turn talks to App2. If that is the case, then the "Ya" on the arrow for step 4 should actually be "Yb".

After observing this, I sent a mail to the ws-tx-comment@lists.oasis-open.org list pointing this out. A few days back I got a reply from Ian Robinson, the WS-Tx co-chair, explaining that the figure is simply an illustration of one possible implementation in which step 4 calls step 5 and returns its result. So does that mean subsequent protocol-level communication happens between Ya and App2? I am a bit confused!

Friday, September 26, 2008

Yet another Cool Feature of Axis2

It is possible to put the Axis2 transports into maintenance mode so that the transport receivers refrain from accepting further requests and let the already accepted requests leave the transport layer within a specified number of milliseconds. This comes in handy when halting the Axis2 message receivers for maintenance, where a graceful shutdown is essential. This is done through a JMX management method that is implemented by each transport implementation.
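To illustrate the general JMX pattern involved, here is a minimal, self-contained sketch. Note that the MBean name, interface and the `pause` operation are all invented for illustration; the actual Axis2 transport MBeans expose their own names and signatures.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class MaintenanceModeDemo {

    // Hypothetical management interface; a real transport MBean would
    // define its own operations for pausing and resuming receivers.
    public interface TransportControlMBean {
        void pause(long millisToDrain);
        boolean isPaused();
    }

    public static class TransportControl implements TransportControlMBean {
        private volatile boolean paused = false;

        public void pause(long millisToDrain) {
            // A real transport would stop accepting new requests here and
            // wait up to millisToDrain ms for in-flight requests to finish.
            paused = true;
        }

        public boolean isPaused() {
            return paused;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("demo:type=TransportControl,name=http");
        server.registerMBean(new TransportControl(), name);

        // A management client (jconsole, or code like this) invokes the
        // maintenance operation through the MBean server.
        server.invoke(name, "pause", new Object[] { 5000L },
                      new String[] { "long" });

        System.out.println("paused = " + server.getAttribute(name, "Paused"));
    }
}
```

In practice one would connect with jconsole (or a remote JMX connector) and trigger the operation from there rather than in-process.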

Asynchronous Web Services with Axis2

One of the coolest features of Axis2 is its support for asynchronous Web service invocations. Read this recent post on Danushka's blog for more on it.

Wednesday, September 24, 2008

How to Increase Java Heap Memory for Maven 2

The heap size of the JVM used by Maven can be changed using the environment variable MAVEN_OPTS.

On Windows : set MAVEN_OPTS=-Xmx512m

On Linux (bash shell) : export MAVEN_OPTS=-Xmx512m
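To verify that the setting has actually taken effect, one could check what `-Xmx` limit the JVM is running with. The class name here is just an example; run it with `MAVEN_OPTS`'s value passed as the JVM option.

```java
public class HeapCheck {
    public static void main(String[] args) {
        // Runtime.maxMemory() reports the -Xmx limit of the running JVM.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```

Running `java -Xmx512m HeapCheck` should report a value close to 512 MB (the JVM may reserve a little for itself).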

Google Chrome - How to Open a New Tab in a Separate Process?

I was surprised to see Google Chrome launching a separate process for each newly opened tab. Actually I was under the impression that it was a bug, as Chrome was in beta. But today I found this interesting piece of information on the Chrome site itself, which explains that this behavior is intentional.

Google Chrome has a multi-process architecture which makes each opened tab run in a separate process rather than a separate thread.

I used the following simple HTML document to check this out.

<html>
  <head>
    <script type="text/javascript">
      <!--
      function createProc()
      {
        var w = window.open();
        w.opener = null;
        w.document.location = "http://www.google.com";
      }
      //-->
    </script>
  </head>
  <body>
    <input type="button" value="Launch Google" onclick="javascript:createProc()"/>
  </body>
</html>

I could observe in the Windows Task Manager that clicking the "Launch Google" button opens Google in a new tab that runs in a separate Chrome process.

[Image: Windows Task Manager showing the new tab running in a separate Chrome process]

Axis2 Transports are Now a Separate Project

The Apache Axis2 Transports are now a separate project under WS-Commons.

Transports are a part and parcel of Apache Axis2 and with the evolution of Axis2, a number of new transports have been introduced.

Apache Synapse has been developed on top of Axis2, so naturally the Axis2 transports are used by Synapse as well. But with the evolution of Synapse, the need for a couple of new transports arose - FIX, VFS and AMQP to name a few - and for obvious reasons, they were not propagated back to Axis2 but kept with Synapse itself. Also, certain Axis2 transports were fine-tuned to meet certain requirements in the Synapse space. Needless to say, this leads to a hell of a lot of problems when it comes to maintaining the transports.

So, from a maintenance point of view, it's a brilliant idea to move and nurture the transports as a separate project. But as a standalone project, the new WS-Commons transports may not be as useful as something like Apache Axiom, since these transports are tightly coupled to the Axis2 framework.

Sunday, September 21, 2008

SquirrelFish Extreme: Fastest JavaScript Engine Yet

WebKit's newest JavaScript engine, SquirrelFish Extreme, is available now. It claims to be twice as fast as its predecessor (i.e. SquirrelFish), 35% faster than Google Chrome's V8 and 55% faster than Mozilla's TraceMonkey.

The following comparison chart shows how good the SFX engine is.

[Image: performance comparison chart for SquirrelFish Extreme]

SFX uses four different technologies to deliver much better performance than the original SquirrelFish:

1. Bytecode optimizations

2. Polymorphic inline caching

3. A context-threaded JIT compiler

4. A regular expression JIT compiler

This blog post on WebKit website explains these technologies in detail.
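Of the four, polymorphic inline caching is perhaps the least self-explanatory. The sketch below is a toy model of the idea (not WebKit's actual implementation, and all names are invented): objects with the same "shape" store a given property at the same slot index, so a property-access site can cache shape-to-slot mappings and skip the hash lookup on repeated accesses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InlineCacheDemo {

    // A shape maps property names to slot indexes; objects created the
    // same way share one shape instance.
    static class Shape {
        final Map<String, Integer> slots = new HashMap<>();
    }

    static class JsObject {
        final Shape shape;
        final Object[] storage;
        JsObject(Shape shape, Object... values) {
            this.shape = shape;
            this.storage = values;
        }
    }

    // One cache per property-access site; "polymorphic" because it can
    // remember several shapes, not just the last one seen.
    static class CallSiteCache {
        final List<Shape> shapes = new ArrayList<>();
        final List<Integer> slotIndexes = new ArrayList<>();
        int hits = 0, misses = 0;

        Object get(JsObject obj, String name) {
            for (int i = 0; i < shapes.size(); i++) {
                if (shapes.get(i) == obj.shape) {    // fast path: identity check
                    hits++;
                    return obj.storage[slotIndexes.get(i)];
                }
            }
            misses++;                                // slow path: hash lookup
            int slot = obj.shape.slots.get(name);
            shapes.add(obj.shape);
            slotIndexes.add(slot);
            return obj.storage[slot];
        }
    }

    public static void main(String[] args) {
        Shape point = new Shape();
        point.slots.put("x", 0);
        point.slots.put("y", 1);

        CallSiteCache site = new CallSiteCache();
        for (int i = 0; i < 1000; i++) {
            JsObject p = new JsObject(point, i, i * 2);
            site.get(p, "x");                        // same shape every time
        }
        // Only the very first lookup misses; the rest hit the cache.
        System.out.println("hits=" + site.hits + " misses=" + site.misses);
    }
}
```

A real engine compiles the fast path into the generated machine code, which is where the big win comes from; this sketch only models the caching logic.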

Monday, September 15, 2008

Gecko/WebKit Dichotomy

Many technology experts, especially those in the Web arena, have started to talk about the WebKit rendering engine with the release of Google's Chrome browser, which uses WebKit.

WebKit is an open-source HTML rendering engine that was developed by Apple. Being an extremely lightweight renderer with a small memory footprint has made it a popular choice for browser implementations. WebKit is mainly used in Apple's Safari and on the iPhone. On Google's side, it is used in the Android mobile browser as well as in Chrome.

Gecko was developed at Netscape and is known to be extremely powerful. But the issue with it is that most of its rich features come at an extra cost, mostly memory overheads. And above all, its code base is extremely complex.

The two main reasons behind Gecko's complex code base are its XML-based user interface rendering framework, XUL, and its component system, XPCOM. Mozilla's original application suite included a browser, a mail client, a Web design tool and an IRC client. So XUL was supposed to be an all-in-one renderer and not just a simple HTML renderer. XPCOM, of course, made the entire browser highly modular, but as stated earlier, all that comes at an extra cost.

With Firefox 3, Gecko received a massive overhaul. Using the Cairo rendering framework enabled features like full-page zooming. The newest additions to Gecko include support for CSS 3, which is already there in WebKit. With all this, Gecko is in a position to compete head-on with WebKit.

On top of that, one of the most fascinating features of Gecko is that the new XUL runtime that comes with Firefox 3 will allow third-party application developers to come up with some cool applications that run on top of the Mozilla runtime.

So, nobody can ever say WebKit is better than Gecko or vice versa, as each has its own pros and cons.

Friday, September 12, 2008

Content Management Fitting any Structure

Today I happened to read this interesting paper on a mechanism for dynamic content management of large and structurally heterogeneous websites. The system is the brainchild of two Stanford University grad students, Isil Ozgener and Thomas Dillig, and is currently deployed at http://fumarole.stanford.edu/.

I found their work interesting simply because I designed and developed something similar some time back.

I was working on the bill presentment module of a custom billing application some time back. The presentment module is the one that carries out bill generation. Basically, the presenter takes bill data, puts them into a mould according to a predefined layout format and finally generates bills in different content types.

Now this fits into the context of Isil and Thomas's work, as different clients had different bill formats and hence the formatter was supposed to handle any dynamic bill layout. In my case, of course, I had only two types of information (AUIs, according to the paper), namely text and images. But I had something slightly resembling the scripting-module AUI: a computational logic block that took care of basic arithmetic, sorting, grouping, etc. Another thing missing from my implementation was the concept of page navigation.

My implementation had the concept of hierarchical blocks, and there were different types of blocks such as Data Blocks, Summing Blocks, Sequence Blocks, Grouping Blocks, etc. So in the paper's terms, these are different types of AUIs. The complete bill, with logic and layout, was defined in a single XML file, and the blocks were processed sequentially as defined there. There was a C++ code generator that made use of this XML file, and the generated code was specific to a given content type. For example, for PDF it generated C++ code capable of producing LaTeX source, from which the PDF files were generated. For HTML, it generated C++ code capable of producing raw HTML source. I had a really interesting concept for data binding: the data source was an in-memory database, which basically was an in-memory representation of a physical relational database. To elaborate on this, there was a C++ class representing each physical database table, and the in-memory database was a collection of lists of C++ objects.

In the XML bill format description, every block that referred to a database table had a reference to the particular C++ object list. So the content generator used the API calls of the in-memory database to retrieve data.

I am hoping to do a complete blog post on how all this was done and especially how the in-memory database worked.
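To make the block idea a bit more concrete in the meantime, here is a minimal sketch of the in-memory database concept: one class per physical table, with the "database" being just lists of those objects, and a summing block aggregating over such a list. It is written in Java rather than the original C++, and all the names and figures are invented for illustration.

```java
import java.util.List;

public class InMemoryDbSketch {

    // Mirrors one row of a hypothetical billing table.
    static class BillLine {
        final String description;
        final double amount;
        BillLine(String description, double amount) {
            this.description = description;
            this.amount = amount;
        }
    }

    // A "Summing Block" in the XML bill layout would reference the list
    // backing a table and aggregate over it when the generator runs.
    static double sumBlock(List<BillLine> lines) {
        double total = 0;
        for (BillLine line : lines) {
            total += line.amount;
        }
        return total;
    }

    public static void main(String[] args) {
        // The "table" is simply an in-memory list of row objects.
        List<BillLine> lines = List.of(
            new BillLine("Monthly rental", 25.00),
            new BillLine("Call charges", 12.50));
        System.out.println("Total: " + sumBlock(lines));
    }
}
```

The real implementation generated code like this from the XML bill description instead of writing it by hand, and the lists were populated from the physical relational database.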

Wednesday, September 10, 2008

Web1.0 vs. Web2.0 and Beyond

Today I read an interesting paper by Graham Cormode and Balachander Krishnamurthy of AT&T Research. It is a nice comparison between Web1.0 and Web2.0.

According to Graham and Balachander, the distinction between Web1.0 and Web2.0 can be seen from different perspectives: technological, structural and sociological. From a technological perspective, mashups, AJAX and social networks are well-known Web2.0 applications. Treating the user as a first-class object is a sociological aspect of Web2.0.

The authors also describe the challenges of crawling and scraping Web2.0, and of building tools and new techniques to help with this data collection. I think they are anticipating new techniques to crawl RIAs.

According to the paper, some of the attributes of a Web2.0 site include the following:

- Users as first class entities in the system.

- The ability to form connections between users.

- The ability to post content in many forms.

- Other more technical features, including a public API to allow third-party enhancements and "mash-ups", embedding of various rich content types, and communication with other users through internal email or IM systems.

Graham and Balachander also anticipate what might happen beyond Web2.0.

The complete paper is available on the AT&T Research site.