Savage Garden: 2008

Wednesday, September 24, 2008

Why Is a Tree-Structured Cache Implemented in JBoss Cache?

The first version of TreeCache was essentially a single HashMap that was replicated. However, the decision was taken to go with a tree-structured cache because
(a) it is more flexible and efficient and
(b) a tree can always be reduced to a map, thereby offering both possibilities.

The efficiency argument was driven by concerns over replication overhead: a value can itself be a rather sophisticated object, aggregating references to other objects or containing many fields. A small change in the object would therefore trigger the entire object (possibly the transitive closure over the object graph) to be serialized and propagated to the other nodes in the cluster. With a tree, only the modified nodes in the tree need to be serialized and propagated. This is not necessarily a concern for TreeCache, but it is a vital requirement for PojoCache.
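To make the efficiency argument concrete, here is a minimal sketch in plain Java. It is not JBoss Cache code: the class name TreeCacheSketch, its methods, and the path syntax are assumptions made up for illustration, and the tree is represented by path strings for brevity. Each node keeps its own small attribute map, and the cache remembers which node paths were modified, so a replication layer would only have to serialize those nodes.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: this class is NOT part of the JBoss Cache API.
class TreeCacheSketch {
    // One entry per tree node, keyed by its path, e.g. "/person/address".
    private final Map<String, Map<String, Object>> nodes = new LinkedHashMap<>();
    // Paths of nodes modified since the last (pretend) replication round.
    private final Set<String> dirty = new LinkedHashSet<>();

    void put(String path, String key, Object value) {
        nodes.computeIfAbsent(path, p -> new LinkedHashMap<>()).put(key, value);
        dirty.add(path);
    }

    Object get(String path, String key) {
        Map<String, Object> node = nodes.get(path);
        return node == null ? null : node.get(key);
    }

    // Returns the node paths that would need to be serialized, then resets the set.
    Set<String> drainDirtyPaths() {
        Set<String> result = new LinkedHashSet<>(dirty);
        dirty.clear();
        return result;
    }

    public static void main(String[] args) {
        TreeCacheSketch cache = new TreeCacheSketch();
        cache.put("/person", "name", "Joe");
        cache.put("/person/address", "city", "Singapore");
        cache.put("/person/address", "zip", "123456");
        cache.drainDirtyPaths(); // pretend the initial state has been replicated

        cache.put("/person/address", "zip", "654321"); // a small change
        System.out.println("Nodes to replicate: " + cache.drainDirtyPaths());
    }
}
```

After the small change, only the /person/address node is reported as needing replication; with a single flat map holding one large value object, the whole value would have to be sent.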

With PojoCache, object caching by reachability is well supported by the tree-structured cache. Caching by reachability is essentially a recursive mapping of an object graph into the cache store. For example, if a POJO has a reference to another advised POJO, PojoCache will transparently manage the sub-object's state as well. During the initial putObject() call, PojoCache will traverse the object tree and map it accordingly to the internal TreeCache nodes.
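The recursive mapping can be sketched roughly as follows. This is only an illustration of the idea, not how PojoCache is implemented: the Advised marker interface, the ReachabilityMapper class, and the sample Person and Address classes are all hypothetical, and the real putObject() has a different signature and works through its own interception mechanism rather than plain reflection.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: none of these types belong to the PojoCache API.
class ReachabilityMapper {
    // Flattened view of the tree: node path -> value stored at that node.
    final Map<String, Object> store = new TreeMap<>();
    // Objects already mapped, compared by identity, so cyclic graphs terminate.
    private final Map<Object, String> visited = new IdentityHashMap<>();

    void putObject(String path, Object pojo) {
        if (pojo == null || visited.containsKey(pojo)) {
            return;
        }
        visited.put(pojo, path);
        for (Field field : pojo.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            Object value;
            try {
                value = field.get(pojo);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            String childPath = path + "/" + field.getName();
            if (value instanceof Advised) {
                putObject(childPath, value); // recurse into the reachable sub-object
            } else {
                store.put(childPath, value); // plain state stored at this path
            }
        }
    }

    public static void main(String[] args) {
        ReachabilityMapper mapper = new ReachabilityMapper();
        mapper.putObject("/person", new Person());
        mapper.store.forEach((path, value) -> System.out.println(path + " = " + value));
    }
}

// Hypothetical marker standing in for an "advised" POJO.
interface Advised {}

class Address implements Advised {
    public String city = "Singapore";
}

class Person implements Advised {
    public String name = "Joe";
    public Address address = new Address();
}
```

A single call on the Person object also maps the Address it refers to, producing the paths /person/address/city and /person/name.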

Similar research can be found in the following paper:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.2819

Thursday, July 10, 2008

Setting Up StarHub Maxmobile (Huawei E270) on RHEL 5.1

Posts from other people say that this can easily be configured on Ubuntu. However, that doesn't really apply to RHEL 5.1. Unfortunately, I am using RHEL 5.1 on my laptop, as preconfigured by our IT helpdesk. I have no intention of reinstalling with Ubuntu or Fedora, as I would have to figure out how to configure other things like VPN and email all over again.

RHEL is mainly a server OS. I think most people don't use RHEL on a desktop or laptop, except Red Hat employees. I spent some time figuring out how to configure the Huawei E270 modem and get it working on RHEL, and I think it might be useful to share the information here in case other people need to do the same.

First things first: you need to make sure that the modem is recognized properly. Most often the modem is recognized as a USB storage device instead.
1) Download usb_modeswitch from http://www.draisberghof.de/usb_modeswitch/ and extract the files to, for example, /opt/usb_modeswitch-0.9.4.
2) Plug in the Huawei E270 modem and execute the following command as root.

# /opt/usb_modeswitch-0.9.4/usb_modeswitch -v 0x12d1 -p 0x1003 -H 1

You may need to execute the same command a few times until you see the following message:

Looking for default devices
Found default devices (1)
Prepare switching, accessing latest device
OK, Huawei control message successfully sent.
-> See /proc/bus/usb/devices (or call lsusb) for changes. Bye

3) Now if you execute "ls /dev/ttyUSB*", you should be able to see a list of USB devices.

The second step is to dial in to the StarHub 3G network. Log in as root and execute the following commands:

# chmod -c a+rwx /etc/wvdial.conf
# chmod -c a+rwx /etc/ppp/pap-secrets
# chmod -c a+rwx /etc/ppp/chap-secrets

Edit the file /etc/wvdial.conf and add the following content:

[Dialer hsdpa]
Modem = /dev/ttyUSB0
Modem Type = Analog Modem
Baud = 460800
Init1 = ATZ
Init2 = ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
Init3 =
Phone = *99#
Username = star
Password = hub
Ask Password = off
Dial Command = ATDT
Stupid Mode = on
Compuserve = off
Force Address =
Idle Seconds = 0
DialMessage1 =
DialMessage2 =
ISDN = off
Check Def Route = on
Auto DNS = on

After these are done, execute the command "wvdial hsdpa" and you should be able to see the following output:

--> WvDial: Internet dialer version 1.54.0
--> Cannot get information for serial port.
--> Initializing modem.
--> Sending: ATZ
ATZ
OK
--> Sending: ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
OK
--> Modem initialized.
--> Sending: ATDT*99#
--> Waiting for carrier.
ATDT*99#
CONNECT
--> Carrier detected. Starting PPP immediately.
--> Starting pppd at Thu Jul 10 15:32:48 2008
--> pid of pppd: 5031
--> Using interface ppp0
--> local IP address 10.13.165.229
--> remote IP address 10.64.64.64
--> primary DNS address 203.116.1.78
--> secondary DNS address 203.116.254.150

Good luck!

Monday, June 16, 2008

Nine Trends Driving Business in 2008

Source: Wired Magazine: 16.04 (http://www.wired.com/techbiz/it/magazine/16-04/bz_opensource)

1: Open Source Tycoons

2: Social Networks Grow Up

3: Green on the Outside

4: Invisible Internet

5: Rise of the Instapreneur

6: Building a Better Banner

7: Invented in China

8: VCs Look for a New Life

9: The Human Touch

Open Source as Career Opportunity for Developers

Esther Schindler, editor at CIO Magazine, has a great article on this topic. She says:

Sometimes, there isn’t much you can do to kick-start your career. Not everyone can be lucky enough to get involved in a high-profile project at work, or to develop a talent in a technology that’s suddenly in-demand. But it surprises me when IT professionals who aim to move up the career ladders don’t take advantage of one resource that’s a win-win solution all around: get involved in an open source project.

This is particularly important to women in IT, who can feel that it’s hard to get noticed in their companies (see The Executive Woman’s Guide to Self-Promotion for general guidelines on how to counter that problem). But it really applies to anyone who wants to gain experience and visibility in the IT department, even if you don’t care about becoming a rock star.

As a participant in an open source project, everything is in your control. You pick the project that you think is the most valuable, or in which you can develop the skills you need but can’t justify on your résumé. In the universe of open source, you’re judged only by what you contribute. Corporate politics aren’t an issue. If your code is useful, or your technical documentation is appreciated, or you’re just a welcoming voice on the community IRC channel, you have a good chance of being invited to become a committer.

Linux Desktop Applications

OpenProj by Projity is a desktop replacement for Microsoft Project. OpenProj has equivalent functionality and a familiar user interface, and it even opens existing MS Project files. OpenProj is interoperable with Project and includes Gantt charts, PERT charts, and so on.

Google Desktop enables desktop search with Google and adds Google Gadgets to customize your desktop.

CUPS-PDF project is a PDF writer backend for CUPS. It is designed to produce PDF files in a heterogeneous network by providing a PDF printer on the central fileserver. It will convert files printed to its queue in CUPS to PDF and put them in a per-user-based directory structure. It can execute post-processing scripts, e.g. to allow mailing the results to the user.

OpenOffice.org is a cross-platform office application suite available for a number of different computer operating systems. It supports the ISO standard OpenDocument Format (ODF) for data interchange as its default file format, as well as Microsoft Office '97-2003 formats, among many others. The functionality of OpenOffice.org can be enriched with extensions such as templates and add-ons.

Pimp your desktop: automate desktop wallpaper with Webilder.

The Firefox 3 web browser has been available for download since 17 June. With more than 15,000 improvements, Firefox 3 is faster, safer and smarter than ever before.

Thursday, June 05, 2008

The Five Open Source Business Models

Open source has become standard in Silicon Valley, with nearly every software startup planning to release at least some code. So far, they've found five main business models:

1. Sell support services. This is the traditional Linux model, prototyped by Red Hat. It's still a part of most open-source business plans, but on its own it's rarely enough for startups trying to grow. The problem (for the startups) is that anyone can redistribute the code and sell support or consultancy services, so there's nothing to stop an Oracle (NSDQ: ORCL), an IBM (NYSE: IBM), or a Novell (NSDQ: NOVL) from grabbing most of the services revenue.

2. Build (or run) hardware. Free software ought to make hardware more profitable, but relatively few open source companies have taken the hardware route. (Lots of hardware vendors use open-source in their products, of course, but they're not really open-source companies.) The main reason is that installing software on commodity components has even fewer barriers to entry than selling support, as VA Linux showed during the first bubble. Still, some startups have resurrected the idea, notably Vyatta (router) and SocialText (wiki appliance).

3. Proprietary components. Many startups now combine proprietary and open-source code, essentially holding back some functionality from what they release for free. The most successful to use this model so far was VMWare competitor XenSource (now part of Citrix (NSDQ: CTXS)), which gave away the Xen hypervisor but sold its proprietary management software.

Competitor VirtualIron does exactly the same thing, collaborating with Citrix on Xen but competing on management. XenSource's success has made this a popular strategy for other open source startups such as MuleSource (SOA) and Hyperic (systems management.) It also gives established software vendors a clear path to open source.

4. Dual licensing. Some customers just don't want to follow open-source licenses (usually the GPL), so many open-source vendors will happily sell them proprietary licenses for the same software. This works well for companies like Trolltech and MySQL, and it could become more popular thanks to new open-source licenses that place tighter restrictions on what other vendors can do for free.

For example, the limits on home DRM in GPL v3 are intended to make consumer electronics more open, but they could eventually give open-source companies a revenue stream from DRM vendors who want the code without the license. The (so far little-used) AGPL could have an even bigger impact, thanks to its requirement that SaaS users be able to download the source code. The big disadvantage for startups taking this approach is that they can't easily leverage community development, as they need to hold copyright on all code.

5. Advertising. The Mozilla Foundation discovered this almost by accident, when Google (NSDQ: GOOG) paid the Firefox developers so much in referral fees that they had to incorporate as a for-profit. It's also used (along with the other four) by Digium, the main backer of the free Asterisk PBX, which comes pre-configured to connect to particular IP telephony services. I expect we'll see more startups embrace this idea as SaaS becomes more common and ASPs offer big bucks for customer leads.

Wednesday, June 04, 2008

What is the open source business model?

It is often confusing to people to learn that an open source company may give its products away for free or for a minimal cost.

How do open source companies make money?

While it is true that an open source business may not make money directly from its products, it is untrue that open source companies do not generate stable and scalable revenue streams.

In actuality, in the 21st century web technology market, it is the open source company that has the greatest long-term strategic advantage. This is demonstrated by companies such as LINUX, Apache, and Netscape, a host of web-specific technologies such as Java, Perl, TCL, and a host of web-specific technology companies such as Sendmail.

The open source business model relies on shifting the commercial value away from the actual products and generating revenue from the 'Product Halo,' or ancillary services like systems integration, support, tutorials and documentation.

This focus on the product halo is rooted in the firm understanding that in the real-world, the value of software lies in the value-added services of the product halo and not in the product or any intellectual property that the product represents.

In actuality, the value of software products approaches zero in the fast-paced, highly-customized, ever-changing world of information technology.

But it is not simply an acknowledgement of the revenue streams generated by the product halo that makes open source a compelling business strategy.

Open source also cuts down on essential research and development costs while at the same time speeding up delivery of new products.

This paradoxical situation arises from the fact that within an open source project, the community members themselves provide free research and development by contributing new solutions, features, and ideas back to the community as a whole. The company that sits at the center of any successful open source project may reap the rewards of the work of thousands of highly-skilled developers without paying them a cent.

A final strength of the open source business model lies in its ability to market itself.

Because open source products are typically released for free, open source companies that can produce quality products and generate a good reputation can almost immediately grab huge shares of any market based on the complex and far-reaching global referral networks generated by users.

In fact, in the web technology space, almost every global standard has been based upon open source technology.

By using the open source technology model, we can create a superior product, which immediately has a competitive advantage, and which generates multiple scalable revenue streams while being freely available throughout the community.

Thursday, May 22, 2008

Cloud Computing

Here's a rough breakdown of what cloud computing is all about:

1. SaaS (Software-as-a-Service)
This type of cloud computing delivers a single application through the browser to thousands of customers using a multitenant architecture. On the customer side, it means no upfront investment in servers or software licensing; on the provider side, with just one app to maintain, costs are low compared to conventional hosting. Salesforce.com is by far the best-known example among enterprise applications, but SaaS is also common for HR apps and has even worked its way up the food chain to ERP, with players such as Workday. And who could have predicted the sudden rise of SaaS "desktop" applications, such as Google Apps and Zoho Office?

2. Utility computing
The idea is not new, but this form of cloud computing is getting new life from Amazon.com, Sun, IBM, and others who now offer storage and virtual servers that IT can access on demand. Early enterprise adopters mainly use utility computing for supplemental, non-mission-critical needs, but one day, they may replace parts of the datacenter. Other providers offer solutions that help IT create virtual datacenters from commodity servers, such as 3Tera's AppLogic and Cohesive Flexible Technologies' Elastic Server on Demand. Liquid Computing's LiquidQ offers similar capabilities, enabling IT to stitch together memory, I/O, storage, and computational capacity as a virtualized resource pool available over the network.

3. Web services in the cloud
Closely related to SaaS, Web service providers offer APIs that enable developers to exploit functionality over the Internet, rather than delivering full-blown applications. They range from providers offering discrete business services -- such as Strike Iron and Xignite -- to the full range of APIs offered by Google Maps, ADP payroll processing, the U.S. Postal Service, Bloomberg, and even conventional credit card processing services.

4. Platform as a service
Another SaaS variation, this form of cloud computing delivers development environments as a service. You build your own applications that run on the provider's infrastructure and are delivered to your users via the Internet from the provider's servers. Like Legos, these services are constrained by the vendor's design and capabilities, so you don't get complete freedom, but you do get predictability and pre-integration. Prime examples include Salesforce.com's Force.com, Coghead and the new Google App Engine. For extremely lightweight development, cloud-based mashup platforms abound, such as Yahoo Pipes or Dapper.net.

5. MSP (managed service providers)
One of the oldest forms of cloud computing, a managed service is basically an application exposed to IT rather than to end-users, such as a virus scanning service for e-mail or an application monitoring service (which Mercury, among others, provides). Managed security services delivered by SecureWorks, IBM, and Verizon fall into this category, as do such cloud-based anti-spam services as Postini, recently acquired by Google. Other offerings include desktop management services, such as those offered by CenterBeam or Everdream.

6. Service commerce platforms
A hybrid of SaaS and MSP, this cloud computing service offers a service hub that users interact with. They're most common in trading environments, such as expense management systems that allow users to order travel or secretarial services from a common platform that then coordinates the service delivery and pricing within the specifications set by the user. Think of it as an automated service bureau. Well-known examples include Rearden Commerce and Ariba.

7. Internet integration
The integration of cloud-based services is in its early days. OpSource, which mainly concerns itself with serving SaaS providers, recently introduced the OpSource Services Bus, which employs in-the-cloud integration technology from a little startup called Boomi. SaaS provider Workday recently acquired another player in this space, CapeClear, an ESB (enterprise service bus) provider that was edging toward b-to-b integration. Way ahead of its time, Grand Central -- which wanted to be a universal "bus in the cloud" to connect SaaS providers and provide integrated solutions to customers -- flamed out in 2005.

Thursday, May 15, 2008

Reference Objects and Garbage Collection

Unreachable objects (outside the strongly and weakly reachable areas) are not reachable from the root set.

Strongly reachable objects (inside the strongly reachable area to the lower left) are reachable through at least one path that does not go through a reference object.

Weakly reachable objects (inside the weakly reachable area shown enclosed with a dashed line to the upper-right) are not strongly reachable through any path, but reachable through at least one path that goes through a weak reference.

Weakly reachable objects are finalized some time after their weak references have been cleared. The only real difference between a soft reference and a weak reference is that the garbage collector uses algorithms to decide whether or not to reclaim a softly reachable object, but always reclaims a weakly reachable object.

Java API Package: java.lang.ref.*
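As a small illustration of these reachability levels, the following sketch uses WeakReference and SoftReference from java.lang.ref. The class name ReferenceDemo is just a placeholder. Because System.gc() is only a hint and collection timing is up to the JVM, the sketch deliberately makes no promise about what the last two lines print.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        SoftReference<Object> soft = new SoftReference<>(new byte[1024]);

        // While 'strong' still refers to the object, it is strongly reachable,
        // so the weak reference cannot have been cleared.
        System.out.println("weak referent still present: " + (weak.get() == strong));

        strong = null; // the object is now at most weakly reachable
        System.gc();   // only a hint; the JVM is not required to collect here

        // Typically the weak referent is gone by now and the soft referent
        // survives while memory is plentiful, but neither is guaranteed.
        System.out.println("weak referent after gc hint: " + weak.get());
        System.out.println("soft referent after gc hint: " + soft.get());
    }
}
```

The soft reference is the kind to reach for in memory-sensitive caches, while the weak reference suits bookkeeping that should never keep an object alive on its own.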

Monday, March 24, 2008

JDB Example: Generating a Thread Dump

Every once in a while I run into a situation where the usual means for generating a thread dump (stack trace) does not work, making it difficult to track down pesky deadlocks. This seems somewhat more common under OSX, but I’ve seen it happen under Linux as well.

Typically you can type Control-\ (Control-Break on Windows), or send the process the QUIT signal (e.g., kill -QUIT PID) to dump a trace of all active threads in the Java VM. Intermittently, however, I have found that this does not work for java processes that are launched via a shell script. The Control-\ is sometimes simply ignored, while the QUIT signal appears to kill the script, but without dumping a stack trace or killing the java process. (This suggests that in either case, the script is probably just intercepting the signal. There’s probably a fix for that, but I haven’t explored it.)

The jdb debugger utility, included as part of the JDK, provides an alternate way to get a stack trace. First, you must pass two additional arguments to the java vm to tell it to listen for connections from the java debugger. On OSX, Linux, or UNIX, this looks like:

java -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=n class

This will cause the VM to listen for debugger connections on port 8000. You may use any unused port number on which the account running the program is allowed to listen. For non-privileged accounts, this typically means any port number of 1024 or above. For security reasons you might not want to include these options on a production system, but for testing and debugging they impose no measurable performance penalty.

As an example, here is a simple DeadlockDemo class that will produce a deadlock at some indeterminate point. It creates three threads, each of which will repeatedly try to acquire two randomly chosen locks. When a thread successfully gets both locks, it will update a counter. Eventually two threads will try to acquire the same two locks in the opposite order, resulting in a deadlock. Deadlock situations are, of course, not the only cases where stack traces are useful, but are probably the most common:

import java.awt.*;
import javax.swing.*;

/**
* Demo program that should (eventually) produce a deadlock
*/
public class DeadlockDemo extends JFrame implements Runnable {
   /**
    * Label with counter of iterations before deadlock
    */
   private JLabel text = null;

   /**
    * Counter of number of updates
    */
   private int count = 0;

   /**
    * Set of objects to randomly synchronize on
    */
   private Object[] locks = null;

   public DeadlockDemo() {
       setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
       text = new JLabel("Starting up...", JLabel.CENTER);
       setLayout(new BorderLayout());
       add(text, BorderLayout.CENTER);

       locks = new Object[] { new Object(), new Object(), new Object() };
   }

   public void startSomeThreads() {
       for (int i = 0; i < 3; i++) {
           Thread runner = new Thread(this);
           runner.setName("Runner Thread " + i);
           runner.start();
       }
   }

   public void run() {
       while (true) {
           // Pick two locks at random to synchronize on. Eventually
           // two threads will try to acquire the same two locks in
           // the opposite order, resulting in a deadlock.
           synchronized(locks[(int) (Math.random() * locks.length)]) {
               synchronized(locks[(int) (Math.random() * locks.length)]) {
                   count++;
                   synchronized(text) {
                       text.setText("Counter: " + count);
                   }
               }
           }
       }
   }

   public static void main(String[] args) {
       DeadlockDemo demo = new DeadlockDemo();
       demo.setSize(200, 200);
       demo.setVisible(true);
       demo.startSomeThreads();
   }
}

Figure 1. DeadlockDemo, a program that (eventually) deadlocks.

You can run this class from the command line thusly:

java -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=n DeadlockDemo

To get a thread dump, you first need to attach the debugger. In another terminal window, type:

jdb -attach 8000

Note that the port number (8000, in this example) must match the port number that you provided when you launched the virtual machine. You will see:

Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
>

At the jdb prompt, enter “suspend” to temporarily suspend all running threads in the VM. Your program will become unresponsive after you do this. Next, enter “where all” to generate the thread dump. Here is a complete example:

Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
> suspend
All threads suspended.
> where all
DestroyJavaVM:
Runner Thread 2:
 [1] DeadlockDemo.run (DeadlockDemo.java:47)
 [2] java.lang.Thread.run (Thread.java:613)
Runner Thread 1:
 [1] DeadlockDemo.run (DeadlockDemo.java:47)
 [2] java.lang.Thread.run (Thread.java:613)
Runner Thread 0:
 [1] DeadlockDemo.run (DeadlockDemo.java:47)
 [2] java.lang.Thread.run (Thread.java:613)
AWT-EventQueue-0:
 [1] java.lang.Object.wait (native method)
 [2] java.lang.Object.wait (Object.java:474)
 [3] java.awt.EventQueue.getNextEvent (EventQueue.java:345)
 [4] java.awt.EventDispatchThread.pumpOneEventForHierarchy (EventDispatchThread.java:216)
 [5] java.awt.EventDispatchThread.pumpEventsForHierarchy (EventDispatchThread.java:190)
 [6] java.awt.EventDispatchThread.pumpEvents (EventDispatchThread.java:184)
 [7] java.awt.EventDispatchThread.pumpEvents (EventDispatchThread.java:176)
 [8] java.awt.EventDispatchThread.run (EventDispatchThread.java:110)
Java2D Disposer:
 [1] java.lang.Object.wait (native method)
 [2] java.lang.ref.ReferenceQueue.remove (ReferenceQueue.java:116)
 [3] java.lang.ref.ReferenceQueue.remove (ReferenceQueue.java:132)
 [4] sun.java2d.Disposer.run (Disposer.java:123)
 [5] java.lang.Thread.run (Thread.java:613)
AWT-Shutdown:
 [1] java.lang.Object.wait (native method)
 [2] java.lang.Object.wait (Object.java:474)
 [3] sun.awt.AWTAutoShutdown.run (AWTAutoShutdown.java:259)
 [4] java.lang.Thread.run (Thread.java:613)
AWT-AppKit:
Signal Dispatcher:
Finalizer:
 [1] java.lang.Object.wait (native method)
 [2] java.lang.ref.ReferenceQueue.remove (ReferenceQueue.java:116)
 [3] java.lang.ref.ReferenceQueue.remove (ReferenceQueue.java:132)
 [4] java.lang.ref.Finalizer$FinalizerThread.run (Finalizer.java:159)
Reference Handler:
 [1] java.lang.Object.wait (native method)
 [2] java.lang.Object.wait (Object.java:474)
 [3] java.lang.ref.Reference$ReferenceHandler.run (Reference.java:116)

Figure 2. A stack trace produced within jdb.

Here we see that our three “Runner Thread n” threads are all stuck at line 47 in the run method, each waiting for a lock that one of its siblings is holding. The user interface is not blocked, as the event dispatch thread (“AWT-EventQueue-0”) is waiting for another input event. The other threads are handling various background tasks in the VM.
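For comparison, here is a sketch of the usual remedy for this kind of deadlock: always acquire the locks in one fixed order. The class OrderedLockDemo and its runAll helper are made up for this illustration and are not part of the demo above; the Swing window is left out so that the threads can finish and be joined.

```java
class OrderedLockDemo implements Runnable {
    private final Object[] locks = { new Object(), new Object(), new Object() };
    private final int iterations;
    private int count = 0;

    OrderedLockDemo(int iterations) {
        this.iterations = iterations;
    }

    public void run() {
        for (int n = 0; n < iterations; n++) {
            int a = (int) (Math.random() * locks.length);
            int b = (int) (Math.random() * locks.length);
            // Always take the lower-numbered lock first, so two threads can
            // never hold the same pair of locks in opposite order.
            synchronized (locks[Math.min(a, b)]) {
                synchronized (locks[Math.max(a, b)]) {
                    synchronized (this) { // always acquired last; guards the counter
                        count++;
                    }
                }
            }
        }
    }

    synchronized int getCount() {
        return count;
    }

    // Starts the given number of threads, waits for them, and returns the counter.
    static int runAll(int threads, int iterations) {
        OrderedLockDemo demo = new OrderedLockDemo(iterations);
        Thread[] runners = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            runners[i] = new Thread(demo, "Runner Thread " + i);
            runners[i].start();
        }
        for (Thread runner : runners) {
            try {
                runner.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return demo.getCount();
    }

    public static void main(String[] args) {
        System.out.println("Completed updates: " + runAll(3, 100_000));
    }
}
```

With a fixed acquisition order, all three threads run to completion and the program prints a total of 300000 updates instead of hanging.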

To exit jdb, type “quit”. Your program, with the exception of any deadlocked threads, will become responsive again. If you do not wish to exit out of jdb, you can type “resume” instead of “quit” to undo the effects of “suspend”. Note that you can only generate a thread dump when threads are suspended.

Windows

On Windows, a slightly different mechanism is used for communication between the Java virtual machine and JDB. Rather than specifying a port for the connection, you tell the VM to use shared memory. Here’s what the command would look like for the DeadlockDemo example on Windows:

java -Xdebug -Xrunjdwp:transport=dt_shmem,server=y,suspend=n DeadlockDemo

To connect jdb to the virtual machine, enter:

jdb -attach jdbconn

Once connected, the commands used within jdb (suspend, where all, and quit) are the same as on OSX and Linux.

Another Alternative:

On Windows, the 'standard' (i.e., default) connector uses the shared memory transport.

To connect to a socket, you need to tell jdb to use the socket attaching connector, i.e.

jdb -connect com.sun.jdi.SocketAttach:port=9000

It is hard to remember the names and parameters of all the connectors. To list them, run:
jdb -listconnectors
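Separately from jdb, a thread dump can also be produced from inside the running program. The sketch below uses a different technique from the one described in this post, namely the standard java.lang.management API; the class name ThreadDumpSketch and its output format are my own choices, and it only helps if you can add code to the application ahead of time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {
    // Builds a textual dump of all live threads from inside the running VM.
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip threads that vanished meanwhile
            }
            out.append(info.getThreadName())
               .append(" (").append(info.getThreadState()).append(")\n");
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        // Returns null when no threads are deadlocked on object monitors.
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        out.append("Monitor-deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length)
           .append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

In a program like DeadlockDemo, a method like this could be called from a menu item or a watchdog thread, since the event dispatch thread stays responsive even after the runner threads deadlock.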

Thursday, March 13, 2008

Comments from Grady Booch on SOA

Grady Booch, a father of UML and now an IBM fellow, made this comment about SOA in his blog in March 2006:

"My take on the whole SOA scene is a bit edgier than most that I’ve seen. Too much of the press about SOA makes it look like it’s the best thing since punched cards. SOA will apparently not only transform your organization and make you more agile and innovative, but your teenagers will start talking to you and you’ll become a better lover. Or a better shot if your name happens to be Dick. Furthermore, if you follow many of these pitches, it appears that you can do so with hardly any pain: just scrape your existing assets, plant services here, there, and younder [sic], wire them together and suddenly you’ll be virtualized, automatized, and servicized.

What rubbish."

Booch is right. The important thing is that SOA is a strategy that requires time and effort. SOA is a multi-year journey. You need some experience to understand what SOA really is about, and where and how it helps.

And, in IT, each system is different. As a consequence, you will have to build your specific SOA—you can’t buy it. To craft it, you’ll need time and an incremental and iterative approach.

The Theory of Evolution and SOA

"It is not the strongest of the species that survive, nor the most intelligent, but the ones most responsive to change."

The key is flexibility. SOA makes enterprises more responsive to changes.

SOA Design Fundamentals

A design principle is an accepted design guideline or practice that, when applied, results in the realization of specific design characteristics.
A design paradigm represents a set of complementary design principles that are collectively applied in support of common goals.
A design pattern identifies a common problem and provides a recommended solution.
A design standard is a convention internal and specific to an enterprise that may or may not be derived from a design principle or pattern.

Wednesday, March 12, 2008

Lessons Learned in SOA

Start Governance early: SOA Integration and SOA Governance need to work together to give you the benefits of agility, cost-reduction, and reduced risk.
Don't do SOA in an IT vacuum: With SOA and BPM ensure that business processes are optimized to realize the benefit of SOA for the entire organization.
Start small, but think holistically: Small projects are a great starting point, but services quickly develop into composite services. These composite services need to be managed, designed, and deployed centrally, and they need to work together with the SOA Integration solution.

Tuesday, March 11, 2008

SOA Security

Definition of Terms

Policy Enforcement Point (PEP)
the application-specific element that is physically enforcing access to a resource.

Policy Decision Point (PDP)
where decisions are made based on policies.

Policy Administration Point (PAP)
a tool for managing entitlement policies.

Policy Information Point (PIP)
data repository for the PDP, including attributes and policies.
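How the four points fit together can be sketched in a few lines of Java. Every class and method name below is hypothetical and chosen only to mirror the definitions above; real products split these roles across separate services and use far richer policy languages than a single required role per resource.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// All names are hypothetical; this is not the API of any security product.
class SecuritySketch {
    public static void main(String[] args) {
        PolicyInformationPoint pip = new PolicyInformationPoint();
        new PolicyAdministrationPoint(pip).definePolicy("/payroll", "hr");
        pip.addRole("alice", "hr");

        PolicyEnforcementPoint pep = new PolicyEnforcementPoint(new PolicyDecisionPoint(pip));
        System.out.println(pep.access("alice", "/payroll"));
    }
}

// PIP: repository of attributes (user roles) and policies (required roles).
class PolicyInformationPoint {
    private final Map<String, Set<String>> rolesByUser = new HashMap<>();
    private final Map<String, String> requiredRoleByResource = new HashMap<>();

    void addRole(String user, String role) {
        rolesByUser.computeIfAbsent(user, u -> new HashSet<>()).add(role);
    }

    Set<String> rolesOf(String user) {
        return rolesByUser.getOrDefault(user, Collections.<String>emptySet());
    }

    void setRequiredRole(String resource, String role) {
        requiredRoleByResource.put(resource, role);
    }

    String requiredRoleFor(String resource) {
        return requiredRoleByResource.get(resource);
    }
}

// PAP: the management entry point through which policies are defined.
class PolicyAdministrationPoint {
    private final PolicyInformationPoint pip;

    PolicyAdministrationPoint(PolicyInformationPoint pip) {
        this.pip = pip;
    }

    void definePolicy(String resource, String requiredRole) {
        pip.setRequiredRole(resource, requiredRole);
    }
}

// PDP: decides, based on policies and attributes looked up in the PIP.
class PolicyDecisionPoint {
    private final PolicyInformationPoint pip;

    PolicyDecisionPoint(PolicyInformationPoint pip) {
        this.pip = pip;
    }

    boolean isPermitted(String user, String resource) {
        String required = pip.requiredRoleFor(resource);
        // Deny by default when no policy exists for the resource.
        return required != null && pip.rolesOf(user).contains(required);
    }
}

// PEP: sits in front of the resource and enforces the PDP's decision.
class PolicyEnforcementPoint {
    private final PolicyDecisionPoint pdp;

    PolicyEnforcementPoint(PolicyDecisionPoint pdp) {
        this.pdp = pdp;
    }

    String access(String user, String resource) {
        if (!pdp.isPermitted(user, resource)) {
            throw new SecurityException("Access denied: " + user + " -> " + resource);
        }
        return "contents of " + resource;
    }
}
```

The point of the separation is that the enforcement code never contains policy logic itself: changing who may read /payroll only touches the data administered through the PAP.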

Representational State Transfer (REST)

Definition: Representational State Transfer (REST) is an architectural style of large-scale networked software that takes advantage of the technologies and protocols of the World Wide Web. REST describes how distributed data objects, or resources, can be defined and addressed, stressing the easy exchange of information and scalability.

In 2000, Roy Fielding, one of the primary authors of the HTTP specification, wrote a doctoral dissertation titled "Architectural Styles and the Design of Network-based Software Architectures". In it, he coined the term “Representational State Transfer” to describe the networking principles that characterize the World Wide Web.

In the broadest terms, REST outlines how to define and address sources of specific information, commonly known as resources. Resources are referred to individually with a uniform resource identifier (URI), such as the URL used for Web addresses. The term REST often describes any simple interface used to transmit domain-specific data over HTTP without the need for additional messaging layers or session tracking.

REST is an architectural style, not a standard or implementation specification. The largest REST application is the Web itself, characterized by the use of HTTP for transport and URLs as addressing mechanisms. REST can support any type of media, and XML is the most popular method used to transport and represent structured information. REST is used with HTML, XHTML, RSS and proprietary XML vocabularies.

Systems that follow Fielding’s REST principles can be called RESTful, and some of REST’s advocates call themselves RESTafarians. But REST isn’t the only possible approach to building network applications, and there is some disagreement as to whether another might be preferable.

Basics of REST

REST involves several basic notions:

Data elements. Resources (such as data objects), resource identifiers (network addresses, URLs), and representations of resources (HTML documents, JPEG images) are accessed through a standardized interface such as HTTP.

Components. Origin servers, gateways, proxies and user agents communicate by transferring representations of resources through the interface, not by operating directly on the resources themselves. This is generally done using well-defined operations such as GET and PUT.

Connectors. Clients, servers and caches, as well as tunnels such as SOCKS and SSL connections, present an abstract interface for communication, hiding the implementation details of communication mechanisms.

Stateless interaction. All requests made to connectors must contain all the information necessary to understand that request without depending on any previous request. This contrasts with the way many Web sites use cookies to maintain data between sessions. With REST, all messages must include all information necessary to understand the context of an interaction.
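
As a rough illustration of the notions above, the following sketch models resources that are addressed by identifier and manipulated only through GET-like and PUT-like operations, with every call carrying all the information it needs. It is an in-memory toy, not an HTTP server; the class name, method names and sample URIs are all assumptions made for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory sketch of a uniform, stateless resource interface.
// No real HTTP is involved; names and URIs are illustrative assumptions.
class RestSketch {
    // Resource identifier -> current representation (here just a String).
    private final Map<String, String> resources = new ConcurrentHashMap<>();

    // PUT transfers a representation to the resource named by the identifier.
    void put(String uri, String representation) {
        resources.put(uri, representation);
    }

    // GET returns a representation. The request names the resource in full,
    // so no per-client conversation state is kept between calls.
    Optional<String> get(String uri) {
        return Optional.ofNullable(resources.get(uri));
    }

    public static void main(String[] args) {
        RestSketch server = new RestSketch();
        server.put("/orders/42", "<order id=\"42\" status=\"open\"/>");
        System.out.println(server.get("/orders/42").orElse("404 Not Found"));
        System.out.println(server.get("/orders/99").orElse("404 Not Found"));
    }
}
```

The client never calls an order-specific routine; it only reads and writes representations at an address, which is the shift in thinking that REST asks for.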

Why REST?

Many developers find REST challenging because it requires them to rethink their problems in terms of manipulating addressable resources instead of calling another routine to do something with that data. With RESTful design, Web services can be seen as simply a means of publishing information, components and processes to make them accessible to other users and machine processes. For example, the Atom Publishing Protocol, a RESTful application widely used for blogs, simplifies the process of publishing information and makes processes available to others so they can interact with that information. In general, REST requires less client-side software than do other approaches, because a single, standard browser can access any application and data resource.

Sunday, March 02, 2008

Don'ts for SOA

Don't boil the ocean. Make sure the SOA project you choose for your starting point is well defined and well confined. Prove SOA successful with something that is small, is achievable in a short time, and will have a significant impact — then build incrementally.
Don't confuse SOA with an IT initiative. SOA must be a joint endeavor between business and IT. You have everything to gain — and everything to lose if you persist in such pigheadedness.
Don't go it alone. An entire industry is just waiting out there to help you. Don’t ignore it. Beg, borrow, steal, but get help. Reinventing the world is definitely anti-SOA thinking.
Don't think you are special. Stick to standards and standard interfaces. The proprietary software you build will be your own downfall. The sooner you part ways from evil temptations, the happier and healthier your software can be. (The happier and healthier your organization will be too, by the way.)
Don't neglect governance. SOA governance won’t happen by itself. Address it early. SOA governance is as much about the way you work and the processes you put in place to create a SOA environment as it is about any technology issues. So, don’t just go and buy a bucket full of tools labeled SOA governance. SOA governance is about leadership and thinking through how you are going to get from where you are today to a well-coordinated approach that conforms to your corporate goals and objectives.
Don't forget about security. In this brand new world of mixing and matching, it’s easy to get caught up in the euphoria and forget about the nitty-gritty. Pay close attention to the security implications of exposing business services.
Don't apply SOA to everything. SOA makes a lot of sense for a lot of things, but not for everything. If you have an application that is so specialized that it is isolated from other aspects of the business and works just fine, leave it alone. At the same time, when you find the software that is appropriate for SOA, you need to prioritize, scrutinize, and make sure you’re looking at the right level of granularity.
Don't start from scratch. Chances are, one of the SOA vendors has some sort of blueprint for a company just like yours. Take advantage of work already done. Look for a blueprint or model based on your industry first, such as insurance or financial services or banking — many already exist and more are being created every day.
Don't postpone SOA. SOA is a long journey. The sooner you begin, the sooner you’ll get somewhere.

Friday, February 29, 2008

Anti-Patterns

What are Anti-Patterns? Anti-Patterns are common pitfalls that increase complexity with no benefit, decrease flexibility, increase costs and cause project failures.

Learning from others' mistakes is less painful than learning from our own. Ways to apply anti-patterns include the following:

Avoidance - Prevent the anti-pattern from happening.
Remediation - Undo or correct the anti-pattern.
Mitigation - Lessen the impact of the anti-pattern.
Containment - Prevent the damage of the anti-pattern from spreading.

Thursday, February 28, 2008

WSDL Document Structure

WSDL — the Web Services Description Language — is yet another language that was built by using XML. It is used to define a Web service. Like a SOAP message, a WSDL document is divided into four parts:

Definition of ports: We don’t really like the use of the word port here, but we’re stuck with it. A port is a connection point. The WSDL port defines a Web service, the operations that can be requested, and the messages that can be used. In other words, the WSDL description defines what you can do and how you do it after you connect to the port. Another way of describing it is that it is an XML definition of a program function.

Definition of messages: Here you see the definitions of the data items for each of the operations that are defined under the specification of the port. These definitions act as templates for the requesting program to make requests and for it to understand the responses it receives. In reality, these definitions are what a programmer thinks of as function calls. And as you may expect, they relate to a name space.

Definition of types: This defines the data types that are used by the Web service. These relate to a name space as well.

Definition of binding: This is technical stuff for the programs involved. It defines exactly how the two programs can connect to each other.
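
To make the four parts more concrete, here is a minimal sketch that embeds a deliberately tiny WSDL 1.1 skeleton in a Java string and counts its top-level parts with the JDK's DOM parser. The service name, namespace and operation are hypothetical, and the skeleton is not a complete, deployable contract. Note that what the text above calls a port is written as a portType element in WSDL 1.1.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Illustrative only: a hypothetical, stripped-down WSDL 1.1 document.
class WsdlParts {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";

    static final String SAMPLE =
        "<definitions name='Quote' targetNamespace='urn:example:quote'"
      + " xmlns='http://schemas.xmlsoap.org/wsdl/' xmlns:tns='urn:example:quote'>"
      + "<types/>"                                   // data types used by the service
      + "<message name='GetQuoteRequest'/>"          // data items for each operation
      + "<message name='GetQuoteResponse'/>"
      + "<portType name='QuotePort'>"                // operations offered at the port
      + "<operation name='getQuote'>"
      + "<input message='tns:GetQuoteRequest'/>"
      + "<output message='tns:GetQuoteResponse'/>"
      + "</operation>"
      + "</portType>"
      + "<binding name='QuoteBinding' type='tns:QuotePort'/>" // how programs connect
      + "</definitions>";

    // Counts WSDL elements with the given local name anywhere in the document.
    static int count(String wsdl, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(wsdl.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(WSDL_NS, localName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse WSDL sketch", e);
        }
    }

    public static void main(String[] args) {
        for (String part : new String[] {"types", "message", "portType", "binding"}) {
            System.out.println(part + ": " + count(SAMPLE, part));
        }
    }
}
```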

Data Redaction

Data Redaction refers to the function of reducing or limiting the set of data returned to a user based on the entitlements of that user.
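
A minimal sketch of the idea, assuming a record is a map of field names to values and a user's entitlements are simply the set of field names that user may see. Both assumptions, and all names below, are made up for illustration; real entitlement systems are policy-driven.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: redaction as filtering a record by entitled field names.
class RedactionSketch {
    // Returns a copy of the record containing only the fields the user may see.
    static Map<String, String> redact(Map<String, String> record, Set<String> entitledFields) {
        Map<String, String> visible = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            if (entitledFields.contains(field.getKey())) {
                visible.put(field.getKey(), field.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, String> account = new LinkedHashMap<>();
        account.put("name", "A. Customer");
        account.put("balance", "1000.00");
        account.put("taxId", "S1234567A");

        // A teller (hypothetical role) may see name and balance but not the tax ID.
        System.out.println(redact(account, Set.of("name", "balance")));
    }
}
```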

Separation of Concerns

The separation of business logic (what an application does) from computer logic (how the computer is directed to do it) is known as the separation of concerns and is a software engineering best practice that should be applied in the design of all technology systems intended for business users. Unfortunately, this best practice has been observed more in theory than in practice. If you discuss this issue with software engineers, you may hear many excuses. The separation of concerns is often ignored simply because it takes effort to abide by it, and the costs of ignoring it are all in the future — in other words, too often, “quick and dirty” wins out over “slow and sure.” Another pernicious factor thwarting the separation of concerns is the perennial desire of some IT vendors to lock your business logic into their proprietary technology. (Never underestimate the greed factor.)

Creating a reusable architecture takes discipline. And discipline inevitably takes more time than you’d ever expect to establish itself. Management may need to be educated. The upfront costs of establishing and requiring discipline pay manifold dividends over time.

Wednesday, February 27, 2008

Reliable Web Services

Web Service Reliable Messaging is a framework whereby an application running in one application server can reliably invoke a Web Service running on another application server, assuming that both servers implement the WS-ReliableMessaging specification. Reliability is defined as the ability to guarantee message delivery between the two Web Services.

BEA's implementation of the WS-ReliableMessaging specification is heavily dependent on the SAF (Store and Forward) feature provided by JMS.

WSRM delivery assurances:

AtMostOnce
AtLeastOnce
ExactlyOnce
InOrder
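
To show what two of these assurances mean in practice, the sketch below combines ExactlyOnce and InOrder on the receiving side: duplicates are dropped and early arrivals are held back until the gap is filled. It assumes the sender numbers messages starting at 1; the class is hypothetical and is not how any particular server implements the specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical receiver enforcing ExactlyOnce + InOrder via message numbers.
class ReliableReceiver {
    private long nextExpected = 1;                          // assumed to start at 1
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private final List<String> delivered = new ArrayList<>();

    void receive(long messageNumber, String payload) {
        // ExactlyOnce: ignore anything already delivered or already buffered.
        if (messageNumber < nextExpected || pending.containsKey(messageNumber)) {
            return;
        }
        pending.put(messageNumber, payload);
        // InOrder: deliver only while the lowest buffered number is the next one.
        while (!pending.isEmpty() && pending.firstKey() == nextExpected) {
            delivered.add(pending.pollFirstEntry().getValue());
            nextExpected++;
        }
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        ReliableReceiver receiver = new ReliableReceiver();
        receiver.receive(2, "second");   // arrives early, held back
        receiver.receive(1, "first");    // fills the gap, both are delivered
        receiver.receive(2, "second");   // retransmission, dropped
        System.out.println(receiver.delivered());
    }
}
```

AtLeastOnce would correspond to the sender retransmitting until acknowledged, and AtMostOnce to the duplicate check alone.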

Tuesday, February 26, 2008

Message Level Security for Web Services

Message Level encryption creates end-to-end confidentiality instead of point-to-point confidentiality.

Message Level Security lies above transport level security. Message level security allows specific parts of a SOAP message to be encrypted or digitally signed before it is put "on the wire".

Message Level Security addresses the same security requirements as traditional Web Security, that is, authentication, authorization, integrity, confidentiality and non-repudiation.

Message Level Security makes security possible by embedding the security information in a message's SOAP header. The SOAP message itself either contains the information needed to secure the message (by digitally signing or encryption) or it contains information about where to get that information to handle security needs.
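
The following sketch illustrates the principle of carrying security information with the message itself. It is not WS-Security: real implementations use XML Signature and XML Encryption inside the SOAP header, whereas this example substitutes a plain HMAC-SHA256 over the body and a hypothetical header field. The shared key and class name are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: a message that carries its own integrity proof.
// HMAC stands in for the XML Signature a real SOAP stack would use.
class SignedMessageSketch {
    final String signatureHeader;  // stands in for security data in the SOAP header
    final String body;

    SignedMessageSketch(String signatureHeader, String body) {
        this.signatureHeader = signatureHeader;
        this.body = body;
    }

    static String sign(String body, byte[] key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] tag = mac.doFinal(body.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static SignedMessageSketch create(String body, byte[] key) {
        return new SignedMessageSketch(sign(body, key), body);
    }

    // The receiver recomputes the tag; any intermediary that altered the body is detected.
    boolean verify(byte[] key) {
        byte[] expected = sign(body, key).getBytes(StandardCharsets.UTF_8);
        byte[] actual = signatureHeader.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, actual);
    }

    public static void main(String[] args) {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8);
        SignedMessageSketch message = create("<pay amount=\"10\"/>", key);
        System.out.println("intact:   " + message.verify(key));
        SignedMessageSketch tampered =
            new SignedMessageSketch(message.signatureHeader, "<pay amount=\"1000\"/>");
        System.out.println("tampered: " + tampered.verify(key));
    }
}
```

Because the proof travels inside the message, it survives hops through intermediaries, which is what makes the protection end-to-end rather than point-to-point.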

Services Orchestration and Consumption

In WebLogic Integration, process automation is implemented through Java Process Definitions (JPDs). A JPD is an annotated Java class that is compiled into an EJB, using the annotations to generate code. WLI's sweet spot is in visually designing long-lived, asynchronous, multi-step processes that would otherwise be prohibitively complex to implement.

Service Orchestration is typically performed in:

Java Process Definitions in WebLogic Integration.
Process Models in AquaLogic BPM Suite.
Proxy Services in AquaLogic Service Bus.

Simple service orchestration can be implemented in Apache Beehive Java Web Services in WebLogic Server.

If an asynchronous operation is invoked on the Web Service, the client may proceed with processing in parallel with the service processing. However, the client may require a response to the asynchronous call. One way to accomplish this is to leverage a callback whereby the service calls an interface on the client. However, callbacks cannot always be used to notify the client of request completion. To handle this, a Web Service can provide a polling interface.

Polling and Callbacks are two mechanisms to achieve asynchrony. You might implement both approaches, to handle all situations.
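
The two mechanisms can be contrasted in plain Java with CompletableFuture, as in the sketch below. This is an in-process analogy, not a Web Service client: the submitOrder operation and its result format are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// In-process analogy for callback vs. polling; not a real Web Service call.
class AsyncSketch {
    // Hypothetical asynchronous operation that completes on another thread.
    static CompletableFuture<String> submitOrder(String orderId) {
        return CompletableFuture.supplyAsync(() -> "processed:" + orderId);
    }

    public static void main(String[] args) throws InterruptedException {
        // Callback: the "service" invokes client code when the result is ready.
        CompletableFuture<Void> callback =
            submitOrder("A1").thenAccept(result -> System.out.println("callback got " + result));

        // Polling: the client keeps asking whether the result is ready yet.
        CompletableFuture<String> polled = submitOrder("B2");
        while (!polled.isDone()) {
            TimeUnit.MILLISECONDS.sleep(10);   // client is free to do other work here
        }
        System.out.println("poll got " + polled.join());

        callback.join();   // make sure the callback has run before exiting
    }
}
```

Polling works even when the service cannot reach back to the client (for example, through a firewall), at the cost of repeated requests.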

Thursday, February 14, 2008

Canonical Data Model

I am designing several applications to work together through Messaging. Each application has its own internal data format.

How can you minimize dependencies when integrating applications that use different data formats?

Therefore, design a Canonical Data Model that is independent from any specific application. Require each application to produce and consume messages in this common format.

The Canonical Data Model provides an additional level of indirection between the applications' individual data formats. If a new application is added to the integration solution, only the transformation between that application's format and the Canonical Data Model has to be created, independent of the number of applications that already participate.
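
A small sketch of the pattern, assuming two made-up applications ("crm" and "billing") with made-up wire formats. Each application registers one translator to the canonical form and one from it, so adding a third application means adding one pair of translators rather than one per existing partner.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a Canonical Data Model with per-application translators.
class CanonicalModelSketch {
    // The application-neutral format every message passes through.
    static final class CanonicalCustomer {
        final String fullName;
        final String countryCode;

        CanonicalCustomer(String fullName, String countryCode) {
            this.fullName = fullName;
            this.countryCode = countryCode;
        }
    }

    private final Map<String, Function<String, CanonicalCustomer>> toCanonical = new HashMap<>();
    private final Map<String, Function<CanonicalCustomer, String>> fromCanonical = new HashMap<>();

    void register(String app,
                  Function<String, CanonicalCustomer> in,
                  Function<CanonicalCustomer, String> out) {
        toCanonical.put(app, in);
        fromCanonical.put(app, out);
    }

    // Any-to-any translation always goes source -> canonical -> target.
    String translate(String fromApp, String toApp, String message) {
        CanonicalCustomer canonical = toCanonical.get(fromApp).apply(message);
        return fromCanonical.get(toApp).apply(canonical);
    }

    // Assumed formats: crm uses "name;country", billing uses "country|name".
    static CanonicalModelSketch demo() {
        CanonicalModelSketch hub = new CanonicalModelSketch();
        hub.register("crm",
            s -> { String[] p = s.split(";"); return new CanonicalCustomer(p[0], p[1]); },
            c -> c.fullName + ";" + c.countryCode);
        hub.register("billing",
            s -> { String[] p = s.split("\\|"); return new CanonicalCustomer(p[1], p[0]); },
            c -> c.countryCode + "|" + c.fullName);
        return hub;
    }

    public static void main(String[] args) {
        System.out.println(demo().translate("crm", "billing", "Ada Lovelace;GB"));
    }
}
```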

Sunday, February 10, 2008

Top Five Java Technologies to Learn in 2008

#5 OSGi - Reality check, monolithic containers carry too much baggage and Java libraries are so richly cross dependent. The trend is there, a lot of frameworks are moving towards OSGi to bring some sanity to their deployment. Projects that have employed OSGi in anger are Eclipse via Equinox, Nuxeo and BEA Event Server.
#4 JCR - Reality check, not all data fits well within a relational database. In most cases, users want to store their own documents and have those properly managed (i.e. versioned). JCR with its Jackrabbit implementation is becoming the de facto standard for maintaining data other than the structured kind. Some examples of projects that have used this in unexpected and innovative ways are Drools BRMS for managing business rules, Apache Sling for universal resource storage and Mule Galaxy for SOA governance management.
#3 GWT - Reality check, AJAX is here to stay and JavaScript is still a pain to work with. GWT is gaining traction like wildfire at the expense of other Java web technologies like JSF. A lot of projects have begun creating extremely cool products with it. Some impressive examples are Queplix (a CRM), Compiere (an ERP) and GPokr (a multiplayer Texas hold 'em poker game).
#2 Groovy - Reality check, sometimes you have to write quick and dirty scripts to get your tasks done quickly. There's a lot of traction these days for dynamic scripting languages like Ruby. However, if you want to truly leverage your existing skill set, then it's more efficient to take an evolutionary step. Groovy has come a long way since its rocky beginnings. I believe Groovy is finally mature enough (it finally has a debugger) that it's safe to dip your toes in it. Furthermore, there are a lot of books, books about frameworks (e.g. Grails) and tools (e.g. IntelliJ) that keep you from getting lost.
#1 Cloud Computing - Reality check, sometimes it just isn't worth it to set up your own physical servers. Amazon's services are going to be an extreme boon to development productivity. One of the most time-consuming efforts, and one that is too often taken for granted, is the deployment of a load and performance testing harness. In a lot of rigid organizations, it is sometimes problematic to acquire so much hardware for use only for short time periods. There aren't many tools out there yet for the Java developer (see: "Grid Gain Distributed JUnit"); however, it's ramping up pretty quickly. So just as we create our builds from the cloud via Maven repositories, one shouldn't be surprised to find cloud-based testing resources to be part of every developer's tool chain in the future.

Friday, January 11, 2008

Document-Based Web Services

In an RPC-based interaction, the web service is viewed by the consumer as a single logical application or component with encapsulated data, where the WSDL describes the publicly exposed interface and the XML in the SOAP messages exchanged is formatted to map to the discrete operations exposed by that application. In fact, the messages map directly onto the input and output parameters of the procedure calls or operations. Typically, such invocations occur over a synchronous transport protocol like HTTP, where the SOAP request and response are piggybacked on the protocol-level request and response, respectively, to form synchronous request-response interaction patterns. For example, see Figure 1, which illustrates a payment service that accepts payments and returns a status, or a stock quote service that accepts a ticker symbol and returns the current quote in the HTTP response.

In a document-based interaction, the service consumer interacts with the service using documents that are meant to be processed as complete entities. These documents typically take the form of XML, which is defined by a commonly agreed upon schema between the service provider and service consumer. It is also possible that the document exchanged in such an interaction could be in a format other than XML (such as encrypted files); however, the value of agreeing on an XML schema is to facilitate interoperability. In other words, the document represents a complete unit of information and may be completely self-describing.

Document-based message exchanges are more common to asynchronous communication architectures. Also, the effort and complexity involved in building a document-oriented web service is usually more than the effort involved in using an RPC-based architecture. This is because it involves extra steps, such as designing the schema for the documents that will be exchanged, negotiating and arriving at an agreement with business partners on that design, and validating the document against the schema.

A SOAP message on the wire can be represented in either RPC style or document style. This choice is governed by the value of the style attribute in the WSDL file.

Development of a document-driven web service typically starts with the definition of the schemas and the WSDL describing the document exchange.

Wednesday, January 02, 2008

Asynchronous Web Service

WebLogic Server automatically uses WS-Addressing 1.0 when implementing asynchronous web service features.

Buffered oneway operations are implemented by the Web Service tier.

With buffered oneway operations, the generic client invokes the service synchronously. Beneath the covers, the synchronous call only completes upon the delivery of the request on the server-side JMS queue. Once the request is delivered successfully to the JMS queue, the client is unblocked and proceeds with its client-side execution. The JMS queue "weblogic.wsee.DefaultQueue" is used by default, although the buffered operation may specify a custom JMS queue if desired.

The oneway annotation @Oneway() marks a web service operation as buffered: the operation must return void. The optional @BufferQueue annotation specifies the JMS Queue used for a buffered operation. Without this annotation, the default queue "weblogic.wsee.DefaultQueue" will be used.
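
The control flow described above can be mimicked in plain Java, as in the sketch below. It is only an analogy: an in-memory queue stands in for the server-side JMS queue, and none of the names are WebLogic APIs or annotations.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Plain-Java analogy for a buffered oneway operation; no JMS or WebLogic involved.
class BufferedOnewaySketch {
    // Stands in for the server-side JMS queue that buffers requests.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Oneway operation: returns void, and returns as soon as the request is queued.
    void submit(String request) {
        queue.add(request);
    }

    // Server-side worker: handles one buffered request later, or returns null if none.
    String processNext() {
        String request = queue.poll();
        return request == null ? null : "handled " + request;
    }

    int buffered() {
        return queue.size();
    }

    public static void main(String[] args) {
        BufferedOnewaySketch service = new BufferedOnewaySketch();
        service.submit("order-1");           // client is unblocked once this returns
        System.out.println("buffered: " + service.buffered());
        System.out.println(service.processNext());
    }
}
```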

Asynchronous request-response operations are implemented by the client tier.

In the asynchronous request-response model, the client request is delivered to a client-side asynchronous handler. The Async Handler synchronously invokes the remote operation, and triggers the response handler method in the client to process the response.

The @ServiceClient annotation specifies the variables used to call a desired service asynchronously.
The @AsyncResponse annotation specifies the method to handle a response to an async request.
The @AsyncFailure annotation specifies the method to handle an error from an async request.

Happy Path

In the context of software, a happy path is a default scenario that features no exceptional or error conditions. For example, the happy path for a function that validates credit card numbers would be the one where none of the validation rules raise an error, thus letting execution continue successfully to the end, generating a positive response.

- Wikipedia
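
The credit card example can be sketched as follows. The validator uses a format check plus the Luhn checksum as its only rules, which is an assumption for illustration; real card validation involves more. The happy path is the run in which neither rule rejects the input.

```java
// Illustrative validator: format check plus Luhn checksum (assumed rule set).
class CardValidator {
    static boolean isValid(String number) {
        // Unhappy path 1: missing or malformed input.
        if (number == null || !number.matches("\\d{13,19}")) {
            return false;
        }
        // Luhn: from the right, double every second digit and sum the digits.
        int sum = 0;
        boolean doubleIt = false;
        for (int i = number.length() - 1; i >= 0; i--) {
            int digit = number.charAt(i) - '0';
            if (doubleIt) {
                digit *= 2;
                if (digit > 9) {
                    digit -= 9;
                }
            }
            sum += digit;
            doubleIt = !doubleIt;
        }
        // Unhappy path 2: checksum mismatch. Happy path: falls through as true.
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValid("4111111111111111"));  // well-known test number
        System.out.println(isValid("4111111111111112"));
        System.out.println(isValid("not-a-number"));
    }
}
```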

XML Basics

The XML schema document is an XML document that defines a set of valid XML documents.

A well-formed XML document follows the rules of the XML specification. A valid XML document conforms to a particular XML schema.
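
The difference between well-formed and valid can be checked with the JDK's standard XML APIs, as in this sketch. The quote schema and sample documents are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Sketch: well-formedness via parsing, validity via a (hypothetical) XML schema.
class XmlChecks {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='quote'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='symbol' type='xs:string'/>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element></xs:schema>";

    // Well-formed: the document parses under the rules of the XML specification.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Valid: the document also conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<quote><symbol>ACME</symbol><price>12.5</price></quote>";
        String noPrice = "<quote><symbol>ACME</symbol></quote>";
        System.out.println(isWellFormed(good) + " " + isValid(good));
        System.out.println(isWellFormed(noPrice) + " " + isValid(noPrice));
    }
}
```

The second document parses without error but lacks the price element the schema requires, so it is well-formed without being valid.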

XML schemas define data models. XML data binding technologies use data models to translate XML document instances into native data objects. Some of the XML data binding technologies are Castor, JAXB, and XMLBeans (http://xmlbeans.apache.org).

XPath is a language for finding information in XML documents. XPath uses path expressions to navigate XML. XPath is a W3C standard (http://www.w3.org/TR/xpath).
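
A path expression can be tried out with the javax.xml.xpath API that ships with the JDK. The portfolio document and the expressions below are made up for this sketch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Sketch: evaluating XPath expressions against an in-memory XML string.
class XPathSketch {
    // Evaluates the expression and returns the result converted to a string.
    static String evaluate(String xml, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad expression or document", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<portfolio>"
          + "<quote symbol='ACME'><price>12.5</price></quote>"
          + "<quote symbol='INIT'><price>7</price></quote>"
          + "</portfolio>";

        // Navigate by path, filtering on an attribute with a predicate.
        System.out.println(evaluate(xml, "/portfolio/quote[@symbol='INIT']/price"));
        // XPath also has functions, such as count().
        System.out.println(evaluate(xml, "count(/portfolio/quote)"));
    }
}
```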

XQuery is often referred to as "SQL for XML". XQuery leverages XPath expressions (http://www.w3.org/TR/xquery). The XMLBeans API provides an execQuery method for executing XQuery expressions. An XMLBean can be transformed into another XMLBean using XQuery.

Enterprise Web Services

A Port is an instance of a Web Service object. The Port terminology is defined by JSR-109, which states: "A service instance, called a Port, is created and managed by the server."

If a JWS file does not implement a service endpoint interface, all public methods other than those inherited from java.lang.Object will be exposed as Web Service operations. You can override this behavior by using the @WebMethod annotation to specify explicitly those public methods that are to be exposed. If @WebMethod annotation is present, only the methods to which it is applied are exposed.

Web service operations are equivalent to Java public methods. They are the whole reason for having a web service.

Coarse-grained interfaces are recommended as a best practice for external integration in SOA.

A standalone web service client leverages client artifacts to make a call on a web service operation. The client first creates a service. It then obtains a port from the service. Then methods corresponding to the web service's operations can be called on the port.