Saturday, 30 June 2007

XML Serialization in ASP.NET

In the past, maintaining the state of an object in ASP often required some very inventive and painstaking code. In the brave new world of .NET, however, Object Serialization offers us a comparatively easy way to do just that, as well as some other useful tasks.
As a kid, I remember waking up on many a cold morning and stumbling into the kitchen with my eyes half-closed, looking forward to whatever Mom had prepared for breakfast, only to find an anticlimactic bowl of steaming hot just-add-boiling-water instant oatmeal waiting for me on the table. At least I wasn't like the more unfortunate kids whose mothers force-fed them that white silt of death, powdered milk. I am absolutely certain that something must go seriously awry in the dehydration process of milk because upon rehydration, that stuff is just plain nasty.
Be that as it may, I think that at least one of the developers involved in creating the .NET Framework must have been one of those abused children. I see powdered milk fingerprints all over some of the new data management techniques in .NET. Then again, in an age of dehydrated/rehydrated food products, what could be more logical than dehydrated/rehydrated data?
Object Serialization
The technique to which I refer is called "Object Serialization". Object Serialization is a process through which an object's state is transformed into some serial data format, such as XML or a binary format, in order to be stored for some later use. In other words, the object is "dehydrated" and put away until we need to use it again. Let's look at an example to clarify this idea a little further. Suppose we have an object defined and instantiated as shown below:
Public Class Person
    private m_sName as string
    private m_iAge as integer

    public property Name() as string
        get
            return m_sName
        end get
        set(byval sNewName as string)
            m_sName = sNewName
        end set
    end property

    public property Age() as integer
        get
            return m_iAge
        end get
        set(byval iNewAge as integer)
            m_iAge = iNewAge
        end set
    end property
End Class

dim oPerson as New Person()
oPerson.Name = "Powdered Toast Man"
oPerson.Age = 38
Let's say that for some reason, we wanted to save a copy of this object just as it is at this very moment. We could serialize it as an XML document that would look something like this:
<Person>
    <Name>Powdered Toast Man</Name>
    <Age>38</Age>
</Person>

Then, at some later time when we needed to use the object again, we could just deserialize ("rehydrate") it and have our object restored to us just as it was at the moment we serialized it.
Why All This Dried Food?
"This serialization and deserialization is all well and good," you may be saying at this point, "but what can it be used for in real world web applications?" Very good question. What would be the sense of dehydrating a glass of milk just to rehydrate it and then dehydrate it again without drinking any? Never mind that it'll probably taste like the inside of your little brother's sock drawer (maybe I should have come up with a more tasty dehydrated food for this analogy). Some good uses for serialization/deserialization include:
• Storing user preferences in an object.
• Maintaining security information across pages and applications.
• Modification of XML documents without using the DOM.
• Passing an object from one application to another.
• Passing an object from one domain to another.
• Passing an object through a firewall as an XML string.
These are only a few of the many possibilities that serialization opens up for us.
The Dehydration Process

"Okay, okay, I get the picture," you may be muttering now, "but how do I do it?" I'm going to walk you through a simple example of serializing an object to be saved to disk as an XML file. Keep in mind that in .NET we can serialize objects into a binary format or SOAP format as well as into XML, but we will focus solely on XML for this article for the sake of brevity. Also, the object in the example is obviously very simplistic and wouldn't be practical in the real world, but it will serve as a means to clearly illustrate how to serialize an object. The same principles used in this example can then be applied to more complicated tasks. (Note that all the code used for this example is available for download at the end of this article.)
First of all, let's take a look at the class that is the blueprint for our object (this snippet can be found in the file, xmlser.aspx).
<XmlRoot(...)> _
public class Person
    private m_sName as string
    private m_iAge as integer

    <XmlElement(...)> _
    public property Name() as string
        get
            return m_sName
        end get
        set(byval sNewName as string)
            m_sName = sNewName
        end set
    end property

    <XmlElement(...)> _
    public property Age() as integer
        get
            return m_iAge
        end get
        set(byval iNewAge as integer)
            m_iAge = iNewAge
        end set
    end property

    public function Hello() as string
        dim s as string
        s = "Hi! My name is " & Name & " and I am " & Age & " years old."
        return s
    end function

    public function Goodbye() as string
        return "So long!"
    end function
end class
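This looks pretty similar to the class we saw earlier but with a few adjustments. First of all, notice the attribute lines that sit just above the class declaration and above each property; they specify the names of the XML elements that the class and its properties are serialized to. With the class in place, the page sets up a Person object named oLucky, displays the results of its methods, and serializes it:
Response.Write("Hello() = " & oLucky.Hello() & "<br>")
Response.Write("Goodbye() = " & oLucky.Goodbye() & "<br>")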
'Serialize object to XML and write it to XML file
oStmW = new StreamWriter(Server.MapPath("lucky.xml"))
oXS.Serialize(oStmW, oLucky)
oStmW.Close()
First of all, we declare and instantiate an XmlSerializer object. You'll notice that we had to tell it right from the outset, using the GetType() function, what type of object it's going to be serializing. Next you see that we assign values to the Name and Age properties of the Person object we instantiated. Then we output to the ASP.NET page what the properties are set to by calling the Hello() and Goodbye() methods of the Person object.

Remember that this is only so that we can see what's happening with the object during this process. Next comes the good stuff: We instantiate a StreamWriter object and tell it that it will be writing to a file called lucky.xml. We then call the Serialize() method of the XmlSerializer object and send it our Person object to be serialized as well as the StreamWriter object so it will write the resulting XML to the file specified. Then we close the StreamWriter, thereby closing the file.
That's it. If everything works correctly, an XML file (lucky.xml) will be written to disk. It should look like this:


52
Lucky Day

Notice that the names of the XML elements are exactly as we specified in the XmlRoot and XmlElement attributes in the class earlier. Next we'll examine how to "rehydrate" our XML into an object instance.
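The serialization steps described above can be sketched as a small self-contained module. This is only a sketch under a few assumptions, not the downloadable code: the element names PersonRecord, PersonName, and PersonAge are made up for illustration, and the file path is passed in by the caller instead of coming from Server.MapPath(), so the code can run outside an ASP.NET page.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' The element names in the attributes are illustrative assumptions.
<XmlRoot("PersonRecord")> _
Public Class Person
    Private m_sName As String
    Private m_iAge As Integer

    <XmlElement("PersonName")> _
    Public Property Name() As String
        Get
            Return m_sName
        End Get
        Set(ByVal sNewName As String)
            m_sName = sNewName
        End Set
    End Property

    <XmlElement("PersonAge")> _
    Public Property Age() As Integer
        Get
            Return m_iAge
        End Get
        Set(ByVal iNewAge As Integer)
            m_iAge = iNewAge
        End Set
    End Property
End Class

Public Module SerializeSketch
    ' Serializes oPerson as XML into the file at sPath (hypothetical helper).
    Public Sub SavePerson(ByVal oPerson As Person, ByVal sPath As String)
        ' The serializer must be told up front which type it will handle.
        Dim oXS As New XmlSerializer(GetType(Person))
        Dim oStmW As New StreamWriter(sPath)
        Try
            oXS.Serialize(oStmW, oPerson)
        Finally
            ' Closing the writer also closes the file, even if Serialize fails.
            oStmW.Close()
        End Try
    End Sub
End Module
```

Calling SavePerson with a Person whose Name is "Lucky Day" and whose Age is 52 writes a file whose root element is PersonRecord and whose child elements are PersonName and PersonAge, in the order the properties are declared.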
Just Add Water
Now let's look at this from the opposite angle. We've just seen how to serialize an object into an XML file and save it to disk, but now suppose we already had an XML file saved and wanted to use it to instantiate an object. In the downloadable code found at the end of this article, you will find a file called ned.xml. We're going to use that XML file to create a Person object. Its contents look like this:


47
Ned Nederlander

You'll notice that this XML document has exactly the same structure as the XML file that we wrote to disk a moment ago but the data it contains is, of course, different. Now put on your wicked mad scientist grins and look at the code required to bring this beast of an object to life:
dim oNed as Person
dim oStmR as StreamReader

'Pull in contents of an object serialized into an XML file
'and deserialize it into an object
oStmR = new StreamReader(Server.MapPath("ned.xml"))
'Deserialize() returns an Object, so cast it to Person
oNed = CType(oXS.Deserialize(oStmR), Person)
oStmR.Close()

'Display property values
Response.Write("Hello() = " & oNed.Hello() & "<br>")
Response.Write("Goodbye() = " & oNed.Goodbye() & "<br>")
Before anything else, we declare a Person object and StreamReader object. Next, we create an instance of the StreamReader object and feed it the stored XML file. Then we instantiate the Person object by calling the Deserialize() method of the XmlSerializer object. This method uses the StreamReader object to read the contents of the XML file and then instantiates an object whose state matches that described in the XML file. Finally, we close up the StreamReader object and then output the results of the newly created object's Hello() and Goodbye() methods just to prove that it was successfully created. It's just like that instant oatmeal Mom used to make.
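Note: Something important to remember is that XmlSerializer can only deserialize a class that has a public parameterless constructor. It calls that constructor to create the object and then sets the public properties from the XML; any constructor that takes arguments is never called. Just keep that in mind if you plan on doing this with any objects which are very dependent on their constructors performing some crucial function.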
Do I Have To Keep My Raisins?
Perhaps you are wondering now, "Pretty cool - but what if I don't want to save my object to disk?" Another good question. There's no reason you would have to. Let's suppose that for some reason, you needed to serialize an object into an XML string to be used for some purpose and then forgotten or re-instantiated or whatever else. This can be accomplished in almost the same way that was demonstrated earlier. However, instead of using a StreamWriter object in the process, we will use a StringWriter object. See the code snippet below:
dim oDusty as new Person()
dim oStrW as new StringWriter()
dim sXML as string
'Set properties
oDusty.Name = "Dusty Bottoms"
oDusty.Age = 51
'Serialize object into an XML string
oXS.Serialize(oStrW, oDusty)
sXML = oStrW.ToString()
oStrW.Close()
As you can see, we instantiate a new Person object and StringWriter object and then assign values to the Name and Age properties of the Person object. We then call the Serialize() method of the XmlSerializer object and the Person object is serialized into an XML document and placed in the StringWriter object.
Before we move on, it is important to understand some things about the StringWriter and StreamWriter objects and Inheritance. The Serialize() method of the XmlSerializer object is an overloaded method and one of its signatures is: Overloads Public Sub Serialize(TextWriter, Object). This means we must send it a TextWriter object and some other object.
"Wait a minute!" I hear you shouting, "If it needs to be sent a TextWriter object, why are we sending it StringWriters and StreamWriters?" That's because of Inheritance.
In object oriented development, objects can be derived from other objects, inheriting some or all of the original object's characteristics. This is where StringWriter and StreamWriter come from. They are "descendants" of TextWriter. Think of it this way: A man named Fritz Meyer has two children, Hansel and Gretel. Hansel is not Fritz, but he is a Meyer as is Gretel and when they have a Meyer family reunion, Fritz, Hansel, and Gretel can all get in the door because they are all Meyers. Similarly, because StreamWriter and StringWriter are both descended from TextWriter, they can be used with this call to Serialize(). Unfortunately, StreamWriter doesn't have a way to present its contents as a string data type, but StringWriter does and we are interested, at this point, in getting the XML string rather than saving it to a file.
That is why, in the code snippet above, we send a StringWriter to Serialize() instead of a StreamWriter. (For more information on inheritance and how it is used in .NET, be sure to read: Using Object-Orientation in ASP.NET: Inheritance.)
After the serialization takes place, we capture the XML string by calling the ToString() method of the StringWriter object and placing the results in a string variable. Next, we close the StringWriter object because we no longer need it. We now have our hands on the XML string and can do with it what we please. In the downloadable example code, all we do with it is output it to the browser.
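To tie the two halves together, here is a minimal sketch of a round trip through an XML string: the object is serialized with a StringWriter and then rebuilt with a StringReader, which descends from TextReader and can therefore be handed to Deserialize(). The helper names ToXml and FromXml are my own, and the Person class uses public fields only to keep the sketch short (XmlSerializer handles public fields and public properties alike).

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Pared-down stand-in for the article's Person class.
Public Class Person
    Public Name As String
    Public Age As Integer
End Class

Public Module StringRoundTrip
    ' Serializes a Person into an XML string (hypothetical helper).
    Public Function ToXml(ByVal oPerson As Person) As String
        Dim oXS As New XmlSerializer(GetType(Person))
        Dim oStrW As New StringWriter()
        oXS.Serialize(oStrW, oPerson)
        Dim sXML As String = oStrW.ToString()
        oStrW.Close()
        Return sXML
    End Function

    ' Rebuilds a Person from an XML string (hypothetical helper).
    Public Function FromXml(ByVal sXML As String) As Person
        Dim oXS As New XmlSerializer(GetType(Person))
        Dim oStrR As New StringReader(sXML)
        Try
            ' Deserialize() returns an Object, so cast it back to Person.
            Return CType(oXS.Deserialize(oStrR), Person)
        Finally
            oStrR.Close()
        End Try
    End Function
End Module
```

Passing the string returned by ToXml() straight to FromXml() should hand back a new Person with the same Name and Age as the original, without a file ever touching the disk.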

Conclusion
As you have just seen, serialization is fairly easy to implement. I've already listed several possible reasons to use serialization in your applications and now that you know how to do it, I will leave the rest to your own capable imaginations. This article has focused only on how to serialize an object into an XML document, but please remember that objects can also be serialized into binary or SOAP formats. To learn more about those types of serialization, look up the BinaryFormatter class and the SoapFormatter class.
Maybe this powerful technology didn't really have its humble beginnings in the bottom of a glass of powdered milk - but for some reason, it makes me smile to think so. Then again, maybe someday we'll see Bill Gates or one of his .NET guys sporting a liquidy white moustache on one of those "Got Milk" ads.


Friday, 1 June 2007

Transforming Data into Information

Data in SOA, Part I: Transforming Data into Information

Data and data management are key aspects of nearly every enterprise software solution. SOA is no exception. Effective data modeling and management are an essential part of successful SOA realization. To take your data to the next level you need to transform it into information; to take your information to the next level you need to transform it into knowledge.

This article is the first in a series of two articles on “Data in SOA: Transforming Data into Knowledge.” In this article I describe an approach to transforming data into information in SOA as part of an overall SOA transformation plan, with a definition of a SOA Reference Architecture (SOA RA), and the realization of an enterprise SOA. In Part II of this series I describe an approach to transforming information into knowledge for SOA as an extension to an overall SOA transformation plan and a high-value expansion of an enterprise SOA RA.
Why Data?

Data are ubiquitous (data is plural; datum is singular though both plural and singular verbs can be used with "data"). At their core, most IT efforts are focused on collecting, distributing, and managing data, providing data when it's needed, where it's needed, how it's needed, and for whomever (with proper authorization) needs it. Some may recall that long before the term IT ("information technology") was coined, most enterprises called their "computer departments" and activities DP, or “Data Processing.”

With all the technology waves past, present, and into the foreseeable future, one constant has remained: data. The same data that were (and still likely are) processed by mainframes have also likely been processed by one or more of client-server, CORBA/DCOM, Java EE, .NET, Web services, SOA, and Web 2.0. Over time, the storage, formats, and transports may have changed, and how the data is processed has changed, but the "data" remain (and are growing). In essence, all the industry technology waves have one thing in common: they are new or improved ways to process data. Data are fundamental. If you agree with my premise that data are fundamental to enterprise solutions, it follows that data (and data modeling/management) are also a priority consideration for enterprise architects in SOA (and Web 2.0).
What are Data?

Let's start by selecting your favorite dictionary definition for "data," and then augment it. For the purpose of this article, data are the elemental, atomic, or low-level aggregation of pieces of "information" with some structure (form), relations, and state, but no behavior. For example, an Address table with columns for Street Address, City, and so on, is an example of data, as is the definition of an Address in a Customer Table. Data are structure and state without behavior. Data are the raw building blocks from which we may construct information. Data are the prerequisite for Information.
What is Information?

Again, choose your favorite definition, and then augment it. For the purpose of this article, information is the aggregation of data and the fundamental logic that provides additional form, the basic relations, and syntactic and semantic contexts; that is, it is state and core model behavior. For example, information includes the logic that ensures a ZIP code is valid and consistent with the City. Information extends data by providing the ability to map, or relate, data, and define logic for the behavioral models consistent with the domain (syntax and semantics) context. Information is based on and requires data. In other words, information represents entities (subjects, objects) that encapsulate both state (data) and behavior (logic). You may consider information as being analogous to an instance of a model class in object-oriented programming which contains both data members (instance variables) that hold state and methods that provide (model) behavior.
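The distinction between data and information can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the class names AddressData and AddressInfo, the rule method, and the two-entry ZIP lookup table, which stands in for whatever reference source a real data services layer would consult.

```vbnet
Imports System
Imports System.Collections.Generic

' "Data": structure and state only, no behavior.
Public Class AddressData
    Public StreetAddress As String
    Public City As String
    Public ZipCode As String
End Class

' "Information": the same state plus the logic that gives it meaning.
Public Class AddressInfo
    ' Hypothetical lookup table with sample entries only.
    Private Shared ReadOnly s_oCityByZip As New Dictionary(Of String, String) From {
        {"78701", "Austin"},
        {"10001", "New York"}}

    Private ReadOnly m_oData As AddressData

    Public Sub New(ByVal oData As AddressData)
        m_oData = oData
    End Sub

    ' Rule: the ZIP code must be known and must belong to the stated city.
    Public Function IsZipConsistentWithCity() As Boolean
        Dim sExpectedCity As String = Nothing
        If m_oData.ZipCode Is Nothing Then Return False
        If Not s_oCityByZip.TryGetValue(m_oData.ZipCode, sExpectedCity) Then Return False
        Return String.Equals(sExpectedCity, m_oData.City, StringComparison.OrdinalIgnoreCase)
    End Function
End Class
```

AddressData alone can hold any combination of values; only when it is wrapped in AddressInfo does a mismatch such as ZIP 78701 paired with a different city become detectable.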
The Value of Data in SOA

Organizations have different drivers, starting points, and priorities for defining and refining their SOA Reference Architecture (SOA RA), which may shift during their transformation to SOA. A holistic approach to the planning and design of a SOA RA should include the data services layer. This article uses the term data services layer to include both data and information access services.

Without an enterprise data services layer in your SOA RA, subsequent line-of-business (LoB) projects will be forced to develop individual "point," or one-off solutions, that are specific to each application. Few commonalities will be discovered, few opportunities for shared service definition, reuse, and consistency will be discovered, and the definition of a canonical data model will be elusive. There is a good chance that many of the benefits of SOA (and ROI) will take longer to realize, if they are realized at all. We’ve probably all read statistics that place project resource consumption on data integration tasks at anywhere from 50 to 85 percent of enterprise application software development! This anecdotal "fact" alone should be enough to ensure a data services layer is an integral part of any SOA realization. Combined with the obvious notion that our enterprise software solutions are primarily designed to process data, the value of data in SOA should also be apparent.

Figure 1 is a high-level conceptual view of BEA's SOA Reference Architecture, which illustrates high-level layers. Note the presence of the data services layer as a first-class area, indicating the importance of the data services layer in a SOA RA.

BEA SOA Reference Architecture
Figure 1: SOA Reference Architecture layers

Data, data models, and data management are fundamental to SOA success. In fact, BEA values data services so highly that not only do we offer the AquaLogic Data Services Platform product, but data services are a fundamental part of many BEA Consulting service offerings, which include a Data Services Consulting Service where the focus is on SOA data and information layer planning, design, and development.
A Note About Data Access and Connectivity Services

Data access services refer to information sources often collectively known as Enterprise Information Systems (EIS) as well as databases and file systems. These can be legacy systems, systems of record, packaged commercial applications, customer, partner, and third-party applications and services, and Web services. What they have in common is that they provide data and/or information (which implies behavior in the context of this article) for consumption by other applications. In this sense, these applications when accessed through the data services layer are just another form or source of data. At a higher level of abstraction, data services would look the same to consuming applications, which is one of the primary goals (normalization/consistency) of the data services layer in a SOA RA. The fact that the interface exposed for consumption interacts with one or more databases, tables, back-end, legacy, shrink-wrapped, and/or external systems is an implementation detail encapsulated by the data services layer.

Connectivity services are about exposing applications and databases as application services in a standards-based manner.
Transforming Data into Information

So, your organization is planning a transformation to SOA. Investigation and planning on all layers and aspects of the SOA RA (see Figure 1) has started, and you have been tasked with the realization of the data services layer. Now what? Consider the following transformation steps:

1. Inventory existing data and system access assets
2. Determine dependency matrix
3. Establish baseline metrics/SLAs
4. Set asset priorities
5. Carry out data modeling
6. Create logical modeling
7. Set information rules
8. Establish application specializations

Figure 2 provides an example of a possible set of internal abstraction layers for an SOA RA data services layer where we will map the requirements and capabilities from our 8 steps:

Data Services Layer – Internal Layer Abstraction

Figure 2: Data services layer – internal layer abstraction

Based on your requirements and perspective, you may determine the need for a different set of abstraction layers. At the very least, you should separate the physical and logical layers and distribute your rule types accordingly.

Let's now look at each of these steps in more detail.
1) Inventory Existing Data and System Access Assets

The first step is finding out what is out there, that is, what are your current data and information system access assets. What data and information assets (referred to as simply "assets" for the remainder of the article), for example databases, information sources, and applications (meaning legacy, system of record) does your organization have? For each asset you will want to know the supporting metadata such as documentation, history, technology/tools/products/platforms, versions, ownership/management, location, security, and access mechanisms. Depending on the number of assets and their metadata, you may want to consider some sort of metadata catalogue or repository as well as a standard template or set of templates that captures the meta-information in a consistent manner and allows for search.
2) Determine Dependency Matrix

Once you have started or created the asset catalogue, the second step is to determine the dependency matrix. The dependency matrix, also part of the asset meta-information, captures information on who uses the asset, when they use it, frequency/how often, what they do with, or to, the asset (for example, CRUD), where they use it (that is, what type of access—batch, online, real time, reporting). It is also important to understand why a consumer uses a particular asset as that will help with task prioritization as well as provide requirements for your emerging data models.

Once you have captured the "who, what, where, when, how, and why" for each known consumer of an asset, you can start to analyze and form generalizations across all asset consumers. The goal is to find opportunities for simplification and reuse by transforming existing assets into SOA Building Blocks. These include, but are not limited to, assets in a service-oriented, self-describing, discoverable form that can be readily utilized in an SOA ecosystem using open, common, industry, and/or organization standards.

One definition contained within the set of SOA Building Blocks is your definition of a service. What standards and specifications, and their versions, will be used? For example, specific versions of WSDL, SOAP, UDDI, WS-Security, WS-I Basic Profile, WS-Addressing, XML, and XSD may be required, while others may be optional/recommended. Your data and information access assets will likely take a form consistent with your basic SOA Building Block definition of a "service." (Using your favorite search engine, search on the topics of “Service Identification” and “Service Definition,” which cover this area.)
3) Establish Baseline Metrics/SLAs

Each catalogued asset, since it already exists in some form, should have estimated or actual production usage statistics, including transaction volume, patterns, concurrent users, reliability, availability, scalability, and performance (RASP) information.

Usage information is also a great indicator of business and IT value and priority. This baseline information is used to define a set of metrics that will form the basis of Service Level Agreements (SLAs) and allow for goal definition and tracking over time. Metrics, as well as current production information, are invaluable in sizing and capacity planning of both hardware and software to support the data services layer in SOA. Be sure your SLAs are bidirectional; that is, the service provider defines its SLA terms, conditions, and penalties for each consumer, and consumers are expected to abide by the agreement.

For example, an agreement states that Consumer A may perform a maximum of 100 get() requests on DataServiceXYZ (the asset/service provider) per day (where a day is defined as a 24-hour period starting at 12:00 midnight GMT) and the response time per request is to be <= 2 seconds. If Consumer A sends more than the agreed maximum get() requests, then the service provider is able to apply the penalties as defined in the agreement. There are corresponding expectations on the service provider. Should Consumer A stay at or beneath their request maximum, the service provider must provide a response time <= 2 seconds or face commensurate penalties defined in the agreement.
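The bidirectional agreement in this example can be expressed as a pair of checks, one for each party. The following is a minimal sketch; the class name ServiceLevelAgreement and its members are hypothetical, and the limits are simply the figures from the example (100 requests per day, a response time of at most 2 seconds).

```vbnet
Imports System

' Hypothetical model of the example agreement between Consumer A
' and DataServiceXYZ.
Public Class ServiceLevelAgreement
    Public MaxRequestsPerDay As Integer = 100
    Public MaxResponseSeconds As Double = 2.0

    ' The consumer breaches the agreement by exceeding the daily request maximum.
    Public Function ConsumerInBreach(ByVal iRequestsToday As Integer) As Boolean
        Return iRequestsToday > MaxRequestsPerDay
    End Function

    ' The provider breaches the agreement only when the consumer stayed at or
    ' beneath its maximum and the response still took longer than agreed.
    Public Function ProviderInBreach(ByVal iRequestsToday As Integer, _
                                     ByVal dResponseSeconds As Double) As Boolean
        Return iRequestsToday <= MaxRequestsPerDay AndAlso _
               dResponseSeconds > MaxResponseSeconds
    End Function
End Class
```

Keeping the two checks separate mirrors the agreement itself: each side has its own obligation, and the provider's response-time obligation applies only while the consumer honors its request limit.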

Metrics and SLAs define the expectations and rules of engagement that affect the basis of the value, goal, and sizing of each asset. Track your baseline metrics, SLAs, and reuse to establish a cost and benefits model.

4) Set Asset Priorities

With the preceding set of information captured to some degree, it should be possible to start evaluating each asset in the context of all the other cataloged assets, that is, to assign each asset a priority. A good heuristic is to have at least three and no more than ten (which is excessive) priority levels; any fewer will be inadequate, and any more will be unmanageable.

Priority assignments are designed to assist in the identification of the most important assets based on utilization and the value of the business functions supported. You should design a set of metrics (including those in Step 3) and definitions that provide for empirical comparison and evaluation of each asset to determine its priority assignment. Assigning asset priorities will help determine possible project starting points, potential business/IT sponsors, and relative business value.

Using all of the preceding information, a "current reality" snapshot for each asset can be established, documented, and tracked as these assets are transformed into SOA building blocks. Across all catalogued assets, the top-rated highest priority assets should be selected for the remaining set of steps. The actual number selected depends on your risk assessments, priority valuation, business/IT goals, resources, and similar factors.

5) Data Modeling

Starting with the first selected asset (I recommend doing one asset end-to-end first, perhaps not the highest priority either, as this allows you to exercise the governance and data services layer’s SDLC process in a more controlled and manageable manner), review the existing physical aspects. For a database or set of tables, consider the various queries that are used by consumers, any logic procedures stored in the database, and their triggers, as well as any side-effect actions. This forms the physical data asset definition and description. For information access, what is used: MOM, third-party adapters, or proprietary integrations, point-to-point custom integration? This forms the physical information asset definition and description.

As the data services layer forms an integral part of an overall SOA Reference Architecture, the definitions and requirements for an SOA building block should be defined. There is likely a gap between your asset's current state and the SOA RA building block goal state. The first order of business is to bring the current physical asset as close to your SOA Building Block goal state standard as possible. You may recall the previous discussion regarding the definition and description of a "service" for your SOA Reference Architecture. For simplicity, let's say your definition of a service requires WSDL, SOAP, document-style with documents defined using XSD. Other recommended specifications include WS-Addressing, and XQuery/XPath. With this definition, we need to consider how to transform or map tables in a relational database, XML data, and/or information access systems into a set of services that meet our building block service definition criteria.

There are various tools and technologies to map existing data and information access assets into a physical data layer in Figure 2 to define logical service models consistent with your specific requirements and definition of a service. BEA's AquaLogic Data Services Platform (ALDSP) is our realization technology for transformation of data/information access assets into SOA building blocks (data services), which provides a standards-based, service-oriented data services layer for your SOA Reference Architecture.

Once you import your physical assets (regardless of their interface and implementation), you have what is known as the physical data services layer (refer to Figure 2). Services in the physical data services layer have a consistent look, feel, and representation: the underlying implementation details and communication protocols are abstracted, encapsulated, and removed from view (though you may still go "under the covers" when required), providing only the asset definition (service definition) and operational information. Now that you have your "data," it is time to define your logical model.
6) Logical Modeling

The goal of the logical model is to abstract, integrate, normalize, and manage the aggregation of one or more physical data services. These actions may be abstracted into two logical layers: the logical data normalization layer and the logical data integration layer, as shown in Figure 2, which also have a set of applicable rules: management rules, data rules, integration rules, and business rules.

Before we go further, it is worth noting that ALDSP allows for any number of logical layers that are required to support your logical abstraction design requirements. The logical layers are design-time-oriented only; their purpose is to allow designers and developers to separate and layer logical models and concerns effectively. These logical layers are not part of the runtime deployment—that is, even though there may be several logical layers in design, they do not correspond to a set of indirection layers at runtime. They are flattened and optimized into a single runtime layer. Development and operational staff can view the runtime artifacts and the optimizations and make modifications as they deem necessary.

You may define a different set of criteria and factors as the basis of your logical model layers than the ones I use here. For example, there may be a single layer that contains all of your logical abstractions, or you may have several logical layers. Too few logical layers may prove to be limiting and potentially lead to an increase in complexity over time. At a minimum, you should define a set of criteria that determines your logical abstraction layers and what they contain.

For example, you may have a logical abstraction that performs the normalizations as I show in Figure 2. The logical data normalization layer allows you to "clean up" and simplify any complex or confusing information. It is often difficult if not impossible to change the physical structure of existing databases or other systems over which you do not have direct ownership or responsibility, or changes at that level are simply not practical. The logical data normalization layer provides this opportunity to reengineer without forcing changes in the physical data layer. (If you need more information on "data normalization," I recommend performing a Web search on "data normalization" to learn more about what that is, and what it entails.) The logical layer provides a model design that may be used as a future physical data and information model as the systems that use the data sources directly are updated or retired. The goal of logical data services is to provide a service model that is much easier to use, more understandable, and potentially more reusable by higher-level shared services and consuming applications.

Steps 5 and 6 may be reversed. The key is to ensure your logical models are not overly constrained by the current physical assets. In other words, while your logical models will utilize physical data services, do not let the limitations of those current physical assets limit your logical models or exert undue influence on your overall data services layer design. The physical assets are a starting point upon which to build richer, more expressive models.
7) Information Rules

Rules and rule processing are how data become information. Rules and rule processing provide relations, semantics, and behavior in the data services layer. As shown in Figure 2, there are several categories of rules:

* Management rules provide any requirements and/or restrictions on using the system and data assets that form the physical data layer. This can include security, access windows (dates/times), caching, metadata, transactions, and any side effects or ancillary actions (for example, logging and auditing) that need to be performed.
* Data rules provide validation, consistency, cross-checking, and any other rules associated with data accuracy and consistency. They may also provide cache management and other side effects in the physical or logical models. Data rules are at the table, row, column, and field level.
* Integration rules provide mappings and consistency across logical and physical data layers. Integration maps higher-level abstractions to their corresponding logical or physical layers. For example, a Customer ID defined at a higher-level abstraction as part of a new canonical data model may be converted from/to several underlying native forms drawn from several customer databases and/or backend systems. Integration rules are at the system and/or database layer.
* Business rules provide meaningful business relations and some business logic, that is, behavior. In object-oriented programming, consider the state and behavior encapsulated in your model objects. Business rules perform a similar behavioral role in data services. Business rules capture business processing logic at the data model layer. This logic is fundamental to the business entity’s very definition and its relations with other business entities that are intrinsic to the business entity across all utilizations, for example, in an enterprise-wide, or at least a division-wide, scope. Some of these rules are defined in the canonical model, while others are defined in the application specialization models.
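
As a rough illustration of an integration rule, the sketch below maps a canonical Customer ID to and from the native forms used by two backend systems. The system names ("crm", "billing"), the ID formats, and the function names are all hypothetical and are not taken from the article; in ALDSP such a mapping would be expressed in the data service itself rather than in Python.

```python
# Hypothetical native ID formats for two backend customer systems.
# Neither the system names nor the formats come from the article.
FORMATTERS = {
    "crm": lambda n: f"C-{n:08d}",   # prefixed, zero-padded to 8 digits
    "billing": lambda n: str(n),      # plain integer rendered as text
}

PARSERS = {
    "crm": lambda s: int(s[len("C-"):]),
    "billing": lambda s: int(s),
}


def to_native(canonical_id: int, system: str) -> str:
    """Convert a canonical Customer ID to the form one backend expects."""
    return FORMATTERS[system](canonical_id)


def to_canonical(native_id: str, system: str) -> int:
    """Convert a backend-specific ID back to the canonical Customer ID."""
    return PARSERS[system](native_id)
```

Keeping both directions of the mapping in one place is the point of the rule: consumers of the canonical model never see the native forms.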

8) Application Specializations

Once you have completed your logical model, you have effectively defined a canonical information model. The definition of this model completes the initial design of your information model, meaning you have effectively started to transform your data into your information. There is one final step that further refines your information model: application specializations.

Many consuming applications, though not all, will be able to use the canonical information model directly. Application specialization provides an abstraction layer for consuming applications to define their own logical model specific to their requirements.

Application specializations encapsulate the additional information model state and behavior required by consuming applications, which simplifies the consuming applications' utilization of the canonical information model assets. Since application specializations are unique to each consuming application, or a set of related business applications, there is no need to include them in the canonical information model. If application specializations have a larger scope (for example, across divisions or the enterprise), then they should be part of the canonical information model.
Conclusion

Creating the data services layer for your SOA Reference Architecture and defining the canonical information model for your organization is difficult work that is challenging to do well and often brings little glory. Following the approach described in this article should provide enough information for you to plan, assess, and begin designing your SOA transformation in the data layer, and transforming your organization's data into information. The actual planning, design, and development of your SOA Reference Architecture's data services layer depend on a number of unique factors that are specific to your organization or situation and well beyond the scope of this architecture article.

Now that we have started transforming our data into information in our SOA ecosystem, we can think about transforming our information into knowledge. The second and final article in this series, "Data in SOA, Part II: Transformation of Information into Knowledge," will describe the steps for this transformation.
References

Building Interoperable Insurance Systems with .NET 3.0 Technologies



Mike Walker

Microsoft Corporation

December 2006

Applies to:
Microsoft .NET Framework 3.0

Summary: This white paper will use an insurance-industry scenario to demonstrate interoperability capabilities of the Microsoft platform. Using protocol-level standards alone is not enough; capturing the business side of the messaging transactions is key to making interoperability work for your business. This is true across all industries, not just insurance. (15 printed pages)
Contents

Introduction
Insurance-Industry Forces
Business Terms Used in This Document
Life-Insurance Policy Scenario
Architecture Overview
The Insurance Agent Policy System
The Insurance Carrier Systems
What Is the Value?
Conclusion
Resources
Introduction

The purpose of this white-paper series is to provide guidance around integration challenges.

Through this white paper, we will use an insurance-industry scenario to demonstrate interoperability capabilities of the Microsoft platform. As enterprises have matured, most have ended up with more than one technology stack. These platform stacks range from legacy mainframe-based COBOL or FORTRAN types of applications to the more modern solutions based on .NET, Mobile Systems, or Java, and everything in the middle. As a result, as enterprises have iterated through technologies and technology trends, more than a few bandages have been applied to the various technologies.
Insurance Interop Series

This white paper will serve as a guide for architects who are facing integration challenges in the insurance industry. We will show you how to use Microsoft integration technologies to integrate disparate systems in your enterprise. Additionally, this document will provide pragmatic design guidance for building interoperable solutions using open standards such as WS-*. Additional documents in this series will include the following:

Architecture Overview to Building Interoperable Insurance Systems

Securing Insurance Solutions

Scaling and Operational Management

Deploying Enterprise Solutions

Developing Composite Applications

Technologies that will be covered include:

1. BizTalk 2006. The integration technology for this solution. The solution also uses the BizTalk business rules and workflow orchestration.
2. Windows Communication Foundation (WCF). The programming model to develop Web service messages and manage protocol-level communication by using the WS-* protocols.
3. Windows Workflow Foundation (WF). To create compelling workflows using smart-client technologies.
4. SQL Server 2005. The repository for all of the application and customer data.
5. Windows Server 2003. The server platform.

This scenario will give us a glimpse into the business process. Like many businesses, each insurance company has its own unique way of handling its process. However, there are some similarities that these businesses share at the platform level. The purpose here is to demonstrate that there is a way to leverage these common platform services to build Service-Oriented Architectures (SOA) giving an organization more agility with the processes that differentiate their specific business.
Insurance-Industry Forces

In the insurance industry, there are many technologies at play, ranging from mainframe to UNIX to Windows. With this wide range of platform technologies, it is increasingly difficult to manage and operate while trying to be agile in an ever-changing financial market. For years, organizations have been building and buying technologies to meet these needs. Interoperability has become a necessary evil after the solution has been built and/or implemented. This has left us with point-to-point integrations that address very specific problems only at the application or system level, but not at the business-function level.

Figure 1. The result of point-to-point integrations

If care is not taken, point-to-point integrations over many years result in:

* IT portfolio management becoming unmanageable, given the duplication of systems, multiple variations of integrations, management of dependencies of applications, and so forth.
* Increased cost of IT systems, dramatically rising because of the number of custom integrations.
* Loss of agility, because development of systems is slowed significantly as a result of increased code complexity, limited reusability, and lack of standardization in the enterprise.

So, what does this mean to many insurance carriers? It means that interoperability is of critical importance, not only as an efficiency issue but also as a competitive differentiator. In these days of modern competition, companies must increase the return on investment (ROI) of their IT systems by streamlining the processes and becoming more agile to stay competitive.

Our goal is to address the industry challenges with a set of enterprise-ready technologies on the Microsoft platform. We use the following principles in the examples:

* Enterprise-class solution
* Standard communications:
o Use WS-* standards
o ACORD messages
* Ensuring interoperability with existing solutions

Business Terms Used in This Document

ACORD—ACORD (www.acord.org) is a nonprofit association whose mission is to facilitate the development and use of standards for the insurance, reinsurance, and related financial-services industries.

Order system—Creates requests for external data, transmits them to the appropriate third-party data provider, manages responses received, and matches responses to the appropriate original requestor.

Third-party service provider—An external system to fulfill an underwriting requirements request (for example, a credit-rating system).

Underwriting process—Implementation of the business process for assessing and processing a new business.

Broker system—Possible smart-client front-end system for order entry and progress monitoring used by an insurance broker. Other front-end systems are also possible, such as a Web portal to brokers, or a Web UI for self-service order entry by customers.
Life-Insurance Policy Scenario

The customer, Robert, wants to purchase a platinum-level 1 million dollar life-insurance policy. The broker, Tom, enters Robert's policy application using his smart-client application. The policy is sent to the Order system, where it is processed and routed to the appropriate systems to begin the underwriting process. While in the Order system, third-party services are kicked off. For this scenario, we will use a Paramed, a third-party service that verifies insurers' health insurance and medical records.

The built-in business logic can also generate requests to third parties if a certain condition is met. This could be the broker or another partner of the insurance company.

Figure 2. Business process used for our scenario
Architecture Overview

This section will walk through the high-level logical architecture used in the scenario. The details around specific aspects such as security, messages, development, and deployment will be provided in other papers in this series.

To ensure applicability with real-world challenges, we derived a set of high-level requirements.
Requirements

The following requirements are for an enterprise-class solution:

* Must interoperate with existing, commercial off-the-shelf applications. As discussed earlier, many organizations purchase and customize software. It is critical to address this.
* The integration technology must be Web services. Many forms of communication, such as binary communication, are proprietary. Until the emergence of Web services, there was no standardized way to communicate messages. Web services provide a way to communicate across heterogeneous platforms.
* WS-* standards must be used. Web services using SOAP and WSDL have been the industry integration standard for years. However, these traditional Web services lack the robustness needed for messaging. The WS-* standards provide these necessary features without the usage of binary communication.
* Long-running workflow. Management of long-running orchestrations has been difficult, especially when that workflow spawns many smaller external workflows, in which case reconciliation and transaction management can become complex.

We use BizTalk as the message hub for this solution, given its rich capabilities and the strong need this insurance solution has for tying multiple systems together and managing multiple external workflows.

Figure 3. Using message-bus technology

Shown in Figure 3 is an enterprise view of BizTalk as an enterprise service bus (ESB). Remember that it is not a requirement that this is used as an ESB. This white paper refers to this layer as just a message layer, so that you can incorporate it into your solution in either case.

The rationale for using BizTalk is that it provides a centralized platform for the following capabilities:

* Business-process management—Centralizing reusable business process not only lends to service orientation, but also provides a mechanism for organizations to augment existing or purchased commercial off-the-shelf–based (COTS-based) applications without the complexity of modifying them.
* Workflow orchestration—Management of multiple workflows can be simplified through this platform. Instead of coding or reconciling each workflow, solutions can be managed as they should. We do this by creating one workflow to manage the business process from the beginning to the end that is able to orchestrate multiple internal system workflows.
* Rich adapter support—Jump-starting development is critical to organizations. BizTalk has a wide range of adapters to support your integration needs. In the insurance space, there is an ACORD adapter that can jump-start your integrations. In conjunction with the ACORD adapter, the Web Services Adapter and File-Based Adapters are available for BizTalk.
* Message routing and transformation—Message routing can be very complex when the messages must be transformed so that other systems understand the message. BizTalk can provide a platform to reduce the complexity and still align with open standards.
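
To make the routing-and-transformation idea concrete, here is a minimal content-based router, written in Python purely for illustration. It is not BizTalk's API: the `MessageHub` class, the `trans_type` field, and the optional per-subscriber transform are all assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]
Handler = Callable[[Message], None]
Transform = Callable[[Message], Message]


class MessageHub:
    """Toy message hub: routes by transaction type, transforming per subscriber."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Tuple[Handler, Optional[Transform]]]] = {}

    def subscribe(self, trans_type: str, handler: Handler,
                  transform: Optional[Transform] = None) -> None:
        # Each subscriber may supply a transform into the format it understands.
        self._routes.setdefault(trans_type, []).append((handler, transform))

    def publish(self, message: Message) -> int:
        # Deliver a (possibly transformed) copy to every matching subscriber
        # and report how many subscribers received it.
        routes = self._routes.get(message["trans_type"], [])
        for handler, transform in routes:
            handler(transform(message) if transform else dict(message))
        return len(routes)
```

The sender publishes one message in one format; the hub decides who receives it and in what shape, which is what removes the point-to-point coupling.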

The Insurance Agent Policy System

Currently, the technology trends in the insurance industry range from portals and thick clients to 3270 mainframe terminal-emulation screens and smart clients. Given the diverse number of applications and vendors in this space, we chose a smart-client user interface (UI) to provide the optimal experience for the agent for the following reasons:

* Offline and online modes
* No dependencies on network connectivity
* Rich user experience with much greater functionality

A disconnected model for agents makes sense in many situations, as brokers can often be mobile or have limited connectivity to network resources. However, because we will be using Web services as the core of our messaging strategy when architecting this solution, the manner in which the end broker submits policies should be trivial.

For the client-side architecture, we used Windows Forms to build the user interface that the agents need. There will be several controls, such as data grids, text boxes, and command buttons. A data grid on the Windows form will serve as the window into the policy pipeline for the broker. We use Web services to update this data grid to ensure real-time updates.

Because this is a smart client, returning data can be cached for offline viewing and updating. This provides significant benefits to the brokers. In addition to the data, a small layer of business logic would reside on the client application. The majority of the application logic will reside on the insurance company side. The rationale here is that we will have light rules to drive UI functionality.
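
One way to picture the disconnected model is an outbound queue that holds policy submissions while the broker is offline and sends them in order once connectivity returns. The sketch below is a language-neutral illustration in Python, not the Windows Forms implementation; the class name and the convention that the `send` callable raises `ConnectionError` when offline are assumptions.

```python
from collections import deque


class OfflineSubmissionQueue:
    """Holds submissions locally until the transport succeeds."""

    def __init__(self, send):
        # `send` transmits one submission and is assumed to raise
        # ConnectionError when the network is unavailable.
        self._send = send
        self._pending = deque()

    def submit(self, policy):
        """Queue a submission, then try to deliver everything pending."""
        self._pending.append(policy)
        return self.flush()

    def flush(self):
        """Send pending items in order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # still offline; keep the item for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self):
        return len(self._pending)
```

An item is removed from the queue only after the send succeeds, so a dropped connection never loses a submission.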

Figure 4. Client logical architecture

To make the calls from the client to the messaging tier, we will use Windows Communication Foundation (WCF). WCF will send SOAP 1.2 Web services messages using the ACORD messaging schemas. The WCF layer will provide a unified development model for our developers when coding communications. From the protocol perspective, we will use a series of WS-* standards. However, this is not enough to ensure interoperability. Usage of the ACORD industry standards is key, too. We should be able to interoperate seamlessly between "homegrown" applications, COTS applications, and third-party services.
Messaging Architecture

The use of Web services enables this broad variety of channels to leverage a common Web service that receives new business applications into the underwriting process in the form of an ACORD 103 message that includes a policy number that has been assigned, and that will be used for tracking/correlation purposes throughout this demonstration. This ACORD 103 New Business Submission message will be based on a SOAP Message Transmission Optimization Mechanism (MTOM/XOP) attachment containing the binary representation of Robert's signature to authorize release of medical information, as required by HIPAA. It is absolutely critical that the ACORD standards are incorporated in our messaging. This will ensure portability of the architecture.

It is also essential that communications are secure and reliable. To achieve this, we will use WS-Secure Conversation (WS-SC) for personal information that might pass through an undetermined number of intermediaries. We also use WS-SC for high-volume, frequent requests (such as credit checks) that will be required for all new policy applications. We use WS-Security for less frequent requests, such as an Attending Physician's Statement (APS), where the overhead of session establishment is not justified by the request volume. We also use TLS/SSL (also known as HTTPS) in rare cases where a service is directly processing requests without any intermediate routing.

For messaging where tracking reception is important, such as ensuring receipt of a new policy to claim a commission, we use WS-Reliable Messaging (WS-RM). We also use WS-RM for data requests that are expensive to process (typically involving human workflow, such as APS queries). This ensures that requests are only delivered once, and avoids expensive duplicate requests.
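
The benefit of delivering each request only once can be illustrated with a receiver that remembers which message identifiers it has already processed. This is a simplified sketch of the idea only: WS-RM itself works with sequences, message numbers, and acknowledgements on the wire, and the class and parameter names below are hypothetical.

```python
class ExactlyOnceReceiver:
    """Processes each message id at most once and replays the cached result."""

    def __init__(self, process):
        self._process = process   # the expensive operation, e.g. an APS query
        self._results = {}        # message id -> result already produced

    def receive(self, message_id, payload):
        if message_id in self._results:
            # A retransmission: return the earlier result without redoing the work.
            return self._results[message_id]
        result = self._process(payload)
        self._results[message_id] = result
        return result
```

A sender that retries after a lost acknowledgement therefore cannot trigger a second, costly human workflow.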

For long-running messages, we use WS-Secure Conversation (WS-SC). (See Resources.)

Figure 5. Client message-exchange patterns

| Transaction | Business processes | WS-* protocols | Architecture decision |
| --- | --- | --- | --- |
| Submission of new policy (103 Request) | Broker-client; Underwriting process | WS-Security (WS-S); WS-Reliable Messaging (WS-RM) | WS-S used for personal information that might pass through an undetermined number of intermediaries. WS-RM used to track message receipt. Because of infrequent transactions, no need for session-oriented security mechanisms, such as WS-Secure Conversation. |
| Status queries (122 Request/Response) | Broker-client; Underwriting process; Fulfillment process | WS-Secure Conversation (WS-SC) | Noncritical and individual request or response messages that can be retried easily, but still contain personal information. |
| Underwriting requirement order request (121); Underwriting requirement order response (1122) | Underwriting process; Fulfillment process | WS-Secure Conversation (WS-SC) or WS-Security (WSS) or Transport-level security (TLS/SSL); WS-Reliable Messaging (WS-RM) | These messages contain personal information. WS-SC will be used for high-volume, frequent requests (such as credit checks). Use WS-Security for less frequent requests, where the overhead of session establishment is not justified by the request volume. Use TLS/SSL where a service is directly processing requests without any intermediate routing. Use WS-RM for data requests that are expensive to process. |

Table 1: Business-process messaging design decision matrix
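
The decisions summarized in Table 1 can also be read as a small set of rules. The sketch below encodes one possible reading in Python; the `Exchange` fields, the protocol labels, and the order in which the conditions are checked are interpretations made for illustration, not something the white paper specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Exchange:
    """Traits of a message exchange that drive the protocol choice."""
    has_intermediaries: bool            # may pass through intermediate routing
    high_volume: bool                   # frequent enough to justify a session
    needs_receipt_tracking: bool = False
    expensive_to_process: bool = False


def choose_protocols(x: Exchange) -> List[str]:
    # Security: transport-level when there is no intermediate routing,
    # session-based for high volume, per-message otherwise.
    if not x.has_intermediaries:
        security = "TLS/SSL"
    elif x.high_volume:
        security = "WS-SecureConversation"
    else:
        security = "WS-Security"
    protocols = [security]
    # Reliability: only where receipt matters or duplicates are costly.
    if x.needs_receipt_tracking or x.expensive_to_process:
        protocols.append("WS-ReliableMessaging")
    return protocols
```

For instance, a new policy submission (intermediaries, low volume, receipt tracked) comes out as WS-Security plus WS-Reliable Messaging, which matches the first row of the table.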

You might ask yourself, after making a submission: Why is the status returned in a separate transaction? Well, the reasons are twofold. Firstly, it is important for this to be asynchronous, and the ACORD standard does not allow an implementation without separating the status from the submission. Secondly, the Broker will be getting status returns periodically through the course of the application process by querying the Status Service.
The Insurance Carrier Systems

When architecting the server side of the solution, there were particular aspects and assumptions considered:

* This architecture accounts for fragmented systems.
* Functional areas are self-contained and need to be managed.
* Operating systems and development environments differ.

As a result, there are a significant number of point-to-point integrations with very specific applications, thus causing proprietary implementations. In this solution, a façade layer will be created around these existing applications.

Figure 6. Insurance message bus

Here, you can see how we are using the enterprise service bus (ESB) as a message bus. This layer will serve as the centralized messaging layer that will manage our internal and external messages. Management and orchestration are key benefits of this architecture.

An infrastructure like this can bring order to the chaos of disparate point-to-point integrations by putting intelligent, long-running orchestrations and policies around transactions in one layer instead of many. It would be common to have several distinct COTS-based applications in upwards of five or six systems to accomplish an end-to-end transaction. We are reducing these systems significantly by consolidating the redundant functions, such as workflow and messaging, leaving infrastructure-level functionality where it belongs and keeping the business logic in applicable applications.

It is important to remember that this message bus is a logical representation. The implementation view can look very different. For example, the message bus could be several BizTalk servers, or there could be servers in different DMZ environments to manage both internal and external communications.

Figure 7. Workflow designer

The next tier down, which is where specific business functions are performed, contains two different legacy systems wrapped with an interface: the Order system and the Fulfillment system. The reason that we are keeping these as separate systems instead of consolidating them is that, the majority of the time, these would be two separate COTS-based systems.

A Status system was added for the following reasons:

* To provide a centralized way to report status to the agents.
* To reduce the number of interfaces and control logic needed to query multiple systems.
* It fits nicely with the orchestration capabilities of our ESB for our long-running workflow.
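
A centralized Status system of this kind can be pictured as a single store that the back-end systems report into and that the broker queries by policy number. The following Python sketch is illustrative only; the class, method names, and status strings are hypothetical.

```python
class StatusService:
    """Single place to record and query status for a policy application."""

    def __init__(self):
        self._history = {}  # policy number -> list of (system, status)

    def report(self, policy_number, system, status):
        """Called by back-end systems (e.g. Order, Fulfillment) as work progresses."""
        self._history.setdefault(policy_number, []).append((system, status))

    def latest(self, policy_number):
        """Most recent status for a policy, or None if nothing was reported."""
        events = self._history.get(policy_number)
        return events[-1] if events else None

    def history(self, policy_number):
        """All reported events for a policy, oldest first."""
        return list(self._history.get(policy_number, []))
```

The broker client then needs one interface for status, regardless of how many systems touch the policy.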

The Ordering system and the Fulfillment system have been converted into coarse-grained services. By doing so, we have removed the dependencies of independent implementations. All communication that occurs to these systems now goes through our message hub. The exposed Web service endpoints that are managed from the message bus can then be managed with orchestration technologies built into BizTalk.
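
The move to coarse-grained services can be sketched as a façade that turns several fine-grained calls on a legacy system into one operation. Both classes below are hypothetical stand-ins written in Python for illustration; a real COTS order system would have its own interface.

```python
class LegacyOrderSystem:
    """Stand-in for a black-box application with a chatty, fine-grained API."""

    def __init__(self):
        self.orders = {}

    def open_order(self, policy_number):
        self.orders[policy_number] = []

    def add_requirement(self, policy_number, code):
        self.orders[policy_number].append(code)

    def close_order(self, policy_number):
        # Returns the number of requirements recorded on the order.
        return len(self.orders[policy_number])


class OrderServiceFacade:
    """Coarse-grained service: one call places a complete order."""

    def __init__(self, legacy):
        self._legacy = legacy

    def place_order(self, policy_number, requirements):
        self._legacy.open_order(policy_number)
        for code in requirements:
            self._legacy.add_requirement(policy_number, code)
        return self._legacy.close_order(policy_number)
```

Callers on the message bus see only `place_order`, so the legacy call sequence can change without affecting them.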

Figure 8. End-to-end message-exchange pattern

Now that these applications are exposed as Web services, any technology that can accept Web services XML can integrate with these applications. This removes the tight coupling of other technology protocols that would limit interoperability. For example, you could just as easily use existing Java-based systems, if those were your legacy systems.

SQL Server is used here to store application data in the database layer. Because the core focus of this paper is integration and composite applications, we will not highlight this.

The third-party services referenced are external services that are called by the Fulfillment service. These services have varying protocol needs. However, this paper will show how WS-* standards can provide increased functionality for your services. It is important to note that many of the real-world insurance third-party services only support XML-based communications, not the more advanced SOAP-based Web services. The messaging-architecture sections that follow will have more on the third-party services.
Insurance Carrier Messaging Architecture

This section walks through a basic life policy that is processed by the insurance carrier. Based on the information Robert supplied, the business rules/heuristic logic defined in the underwriting process decides that an Attending Physician's Statement (APS), that is, a physical, is also required.

Because another provider must fulfill this request, the Order system builds an ACORD XML TransType 121 General Requirements Order Request transaction (TXLifeRequest) and transmits it to a secondary external ordering system for Robert's physician (the APS system). This message also contains the MTOM/XOP attachment of Robert's signature that was originally carried on the ACORD 103 New Business Submission, authorizing his physician to release his medical information to the insurance company.

At some point, Robert's physician will process the APS order by verifying that Robert's signature matches the one he has on file, and then examining Robert's medical history, filling in the necessary information required on the APS report.

After the physician has completed the APS report, an 1122 General Requirements Status/Results Transmittal message is generated and transmitted back to the endpoint reference given in the WS-Addressing ReplyTo header of the previous ACORD 121 request. This message will also be delivered reliably using WS-Reliable Messaging.
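
The ReplyTo mechanism can be sketched as follows: the request names the endpoint that should receive the eventual response, and the response refers back to the request's message identifier so the two can be correlated. The Python dictionaries below only mimic the WS-Addressing MessageID, ReplyTo, and RelatesTo properties; the field names, helper functions, and example URL are illustrative assumptions, and no SOAP is produced.

```python
import uuid


def build_request(body, reply_to):
    """Create a request that tells the receiver where to send its answer."""
    return {
        "message_id": f"urn:uuid:{uuid.uuid4()}",
        "reply_to": reply_to,
        "body": body,
    }


def build_response(request, body):
    """Create a response addressed to the request's reply endpoint."""
    return {
        "message_id": f"urn:uuid:{uuid.uuid4()}",
        "to": request["reply_to"],             # where the 121 asked for the reply
        "relates_to": request["message_id"],   # lets the carrier correlate the 1122
        "body": body,
    }
```

Because the reply address travels with the request, the physician's system does not need to know the carrier's endpoint in advance, and days can pass between the two messages.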

The rest of the business process runs, including any automated-actuary decision. However, in this case, because there is an APS and possibly some additional information that cannot be processed automatically, the case is flagged for an underwriter's review and approval.

Figure 9. Underwriting process message-exchange pattern
Fulfillment Service

In the insurance industry, a fulfillment system or service is very different from the process of fulfillment:

* Fulfillment system: A system or service that receives a request and fulfills it. Think of a fulfillment service as an integration component for gathering data. In this scenario, the fulfillment system is responsible for pulling the various reports from third-party providers.
* Fulfillment process: The process in which a policy is issued by the insurance carrier.

You might ask why we kept the fulfillment service. For this scenario, we are assuming that systems such as these are purchased as black-box solutions. This is not to say that you could not remove these layers and incorporate them into a message bus, instead.

Choosing a messaging pattern is not as clear-cut as choosing one set of standards. When designing this part of the solution, we had to take a step back and look at the business and legacy aspects of each individual transaction.

Some transactions, such as receiving a credit report, were easier decisions. However, other transactions, such as pulling an APS report, required the ability to contain attachments.

Here are some aspects to consider when designing your messaging:

* Understand the business process. It is critical to understand how the business uses these messages (for example, securing data). If the data being sent is not sensitive, you do not need to take extensive security precautions for the message.
* Understand how the transactions are consumed from the service providers. This can be both internal and external. Many times, when relying on service providers, there are technical limitations. These can range from standards support to hours of operation.
* Give proper attention to security. This is often overlooked. Protocol-level security, such as SSL/TLS, often is sufficient, but not always. Make sure you evaluate the sensitivity of the data and review the message paths to determine how many endpoints there are before the ultimate consumer.
* Be realistic and pragmatic. When designing these services, do not go overboard trying to use every standard. Do not force a standard into a message, if it does not belong. This will only introduce unneeded complexity.

Figure 10. Fulfillment system message-exchange pattern
What Is the Value?

We talked quite a bit about the Microsoft platform and development technologies by walking through the scenario. We also highlighted architecture decisions. But what we did not do is highlight the features of these Microsoft technologies.

The following are the core benefits of using Microsoft technologies in the insurance industry:

* Business-process automation—Business processes are complex and specific to each carrier. With the orchestration tools provided in BizTalk, orchestrations can be developed by business analysts, removing the developer from this process and enabling the business.
* Reduction of integration code—With the custom adapters in BizTalk and the unified programming model of WCF, the code required to integrate systems is drastically reduced.
* Alignment with standards—WCF and BizTalk are based on Open XML standards out of the box. No more custom coding to incorporate Web services standards.
* Productivity—With an integrated Visual Studio IDE and .NET 3.0 technologies, both the tools and the development language provide substantial productivity gains over other languages.

Conclusion

As demonstrated in this white paper, using protocol-level standards alone is not enough; capturing the business side of the messaging transactions is key to making interoperability work for your business. This is true across all industries, not just insurance.

We have Web services standards, but that is not enough. There is still a level of due diligence that is required to make the optimal technology decisions for your organization. With this specific reference implementation, we go through a real-world scenario and determine the optimum messaging with this scenario's business forces. This can serve as a guide to help you choose message-exchange patterns in your enterprise. With all architectures, there are trade-offs when choosing specific standards. It is important to understand these trade-offs and be willing to take on the resulting implications.

Microsoft is committed to making the job of architecting and developing service-oriented solutions easier for its customers. We show here that Microsoft has removed many of the industry barriers and complexities that daunt customers today. These range from providing thought leadership in the industry standards to automating and building out-of-the-box Web services support.