The Background:
Convention dictates that exception handling in a typical object-oriented
system should be centralised and that every error should be translated into
human-readable form.
This holds because the user then gets the opportunity to correct the data
and recover from the error. It requires developers to catch every possible
exception, convert the technical error into human-readable form and pass it
consistently through the layers until the user sees it in the view layer.
This post, however, analyses the error handling principles that apply in the
SOA world and leaves the choice of best practice to the readers themselves.
The Importance of Loose Coupling:
Services in the SOA world are meant to be atomic: each performs a single
task of the underlying business, which promotes reusability. At the same
time, we need to remember that these services could be accessed or triggered
in one of the following ways:
- By a batch process (at a regular interval)
- By another web service (used by another application, SOAP or REST based)
- By an orchestration service (a service that calls two or more services in a definite sequence)
- By a business event (a special criterion formed from two or more pieces of data)
- By a rule engine (a special service that takes a series of decisions based on service calls)
- By a messaging system (a service call based on an incoming event/message placed in the queue)
This is the key reason for designing services with a loosely coupled
architecture. Look at the actors involved in the invocations above: each
could be a system, a user or an event. Do we expect these actors to correct
the error and call the service back? No. It is not reasonable to expect a
system to correct every error by altering the original message and
re-invoking the same service until it is processed successfully. The major
focus in SOA is service enablement, so that a service can be used by any
service, with any technology, any number of times.
How do we achieve this loose coupling and still have the best exception
handling strategy? Let us analyse our options.
Introduce Consistency in Error Handling
Since services may be invoked from any boundary (across and outside the
enterprise) and triggered by any actor (system or event), it is important to
indicate errors consistently.
Consistency can be achieved through the following key principles:
- Use of Common Error Codes for all services
- Optionally provide the error descriptions
- Uniform Message Structure (Canonical)
Consider the following sample of such a ‘Fault’ message:
<error>
<code>10000</code>
<type>Network Communication</type>
<description>Service or Network Access Error</description>
</error>
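As a rough illustration of these principles, the sketch below builds and reads such a fault message in Python using the standard `xml.etree.ElementTree` module. The `ERROR_CATALOGUE` table and the function names `build_fault` and `parse_fault` are assumptions made for this example; only the code 10000 and its texts come from the sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared catalogue of common error codes.
# Only the entry for 10000 is taken from the sample fault message.
ERROR_CATALOGUE = {
    "10000": ("Network Communication", "Service or Network Access Error"),
}


def build_fault(code, include_description=True):
    """Return the canonical <error> message for a known error code."""
    error_type, description = ERROR_CATALOGUE[code]
    root = ET.Element("error")
    ET.SubElement(root, "code").text = code
    ET.SubElement(root, "type").text = error_type
    if include_description:  # the description is optional
        ET.SubElement(root, "description").text = description
    return ET.tostring(root, encoding="unicode")


def parse_fault(xml_text):
    """Read a canonical fault message back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


fault_xml = build_fault("10000")
fault = parse_fault(fault_xml)
```

Because every service would emit the same structure, any caller (batch job, orchestration service, rule engine) can rely on the `code` field without knowing which service produced the fault.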
Centralize the Error Handling
Centralization in SOA means keeping the error identification logic within
the individual services, while moving the decision making about the error
into isolated, centralized logic in a dedicated service or set of services.
This is important so that exception handling code is not duplicated across
different services.
The more logic is duplicated, the greater the overhead of maintenance,
defect fixing and testing, and in turn the more errors.
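The split between identification and decision making might look like the following minimal sketch. The service `order_service`, the `Fault` class, the code 20000 and the decision table are all hypothetical names chosen for illustration, not part of any real framework.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    service: str  # which service identified the error
    code: str     # common error code
    type: str     # error category


# Central decision table (hypothetical): error type -> action to take.
DECISIONS = {
    "Network Communication": "retry",
    "Validation": "store_for_manual_correction",
}


def decide(fault):
    """Central error handling service: the only place decisions are made."""
    return DECISIONS.get(fault.type, "notify")


def order_service(payload):
    """A business service: it identifies the error but does not decide."""
    if "customer_id" not in payload:
        return Fault(service="order_service", code="20000", type="Validation")
    return None  # no error identified


action = decide(order_service({}))
```

With this arrangement, changing how validation errors are treated means editing one table rather than every service.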
Connect the Business to Error Handling:
It is important for every enterprise to analyse the errors that occurred in
a given period of time, for the following reasons:
- To identify the pattern of the errors that occurred.
- To identify the frequency of the errors that occurred.
- To identify the services causing the errors.
This analysis helps the organization to evolve its SOA through continuous
improvement of the different systems, which themselves evolved over a period
of time. It also helps to build a strategy around the results, find the root
cause, and thereby take effective decisions and cut costs by reducing
maintenance.
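A small sketch of such an analysis, assuming the errors have already been collected as (service, error code) pairs. The log entries and the `summarise` function are made up for the example.

```python
from collections import Counter

# Hypothetical error log for one period: (service name, error code).
error_log = [
    ("payment_service", "10000"),
    ("order_service", "20000"),
    ("payment_service", "10000"),
    ("payment_service", "30000"),
]


def summarise(entries):
    """Count errors per code and per service for the given period."""
    by_code = Counter(code for _, code in entries)
    by_service = Counter(service for service, _ in entries)
    return {
        "most_frequent_code": by_code.most_common(1)[0],
        "errors_per_service": dict(by_service),
    }


summary = summarise(error_log)
```

Here `summary` shows both which error code recurs most often and which service produces the most errors, the two starting points for a root cause investigation.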
Build Effective Error Handling Strategy
Error handling needs a definite strategy, with a reason behind every action.
For example, it should be built around the following basic error handling
principles or policies:
- Support for retrying the failed messages that are valid.
- Support for Notification and Alerts.
- Deactivating the Service events during failure of dependent services.
- Persisting/Storing the Failed Messages for future Manual Correction.
- Logging the Error and the Related Events.
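Several of these policies can be combined in one small wrapper, sketched below under simple assumptions: `TransientError` is a hypothetical marker for errors worth retrying, `failed_messages` stands in for a persistent store, and `notify` stands in for a notification service. Deactivating service events is left out to keep the example short.

```python
import logging

logger = logging.getLogger("error_handling")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. a network outage)."""


failed_messages = []  # stand-in for a persistent store of failed messages


def process_with_policy(message, service, max_attempts=3, notify=print):
    """Retry transient failures; store, log and notify when giving up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return service(message)
        except TransientError as exc:
            # The message is valid, so it is worth trying again.
            logger.warning("attempt %d failed: %s", attempt, exc)
        except Exception as exc:
            # Any other error will not be fixed by a retry.
            logger.error("message rejected: %s", exc)
            break
    failed_messages.append(message)  # keep it for manual correction
    notify(f"Message stored for manual correction: {message!r}")
    return None
```

A caller would pass the service as a function, for example `process_with_policy(order, submit_order)`, where `submit_order` is whatever callable invokes the real service.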
Build High Performance Architecture
The power of SOA is the ability to scale performance when needed. The reason
for centralizing error handling and making it consistent is to keep the
system from being affected by the number of errors.
The error handling strategy should focus on performance, not on supporting
the user in correcting the error. An asynchronous messaging pattern can
optionally be introduced so that errors are handled without affecting the
performance of the actual service. Asynchronous error handling means
building the services so that errors are processed offline, without making
the calling services wait for the error handling logic to complete.
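The idea can be sketched with an in-process queue and a worker thread from the Python standard library. In a real SOA landscape a message broker would take the place of `queue.Queue`; the names `error_queue`, `error_worker` and `call_service` are assumptions for this example.

```python
import queue
import threading

error_queue = queue.Queue()
handled = []  # stand-in for logging, persisting and notifying


def error_worker():
    """Process faults offline, away from the calling service."""
    while True:
        fault = error_queue.get()
        if fault is None:  # sentinel value: stop the worker
            error_queue.task_done()
            break
        handled.append(fault)
        error_queue.task_done()


def call_service(payload):
    """The service hands the fault over and returns immediately."""
    if payload.get("amount", 0) < 0:
        error_queue.put({"code": "20000", "payload": payload})
        return "rejected"
    return "accepted"


worker = threading.Thread(target=error_worker, daemon=True)
worker.start()

status = call_service({"amount": -5})
error_queue.join()  # demonstration only: wait until the fault is handled
```

The caller receives its answer as soon as the fault is queued; the `join()` call exists only so the example can inspect `handled` afterwards.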
Build Monitoring Process around Error Handling Strategy
Now that we have a strategy in place to gather and process all the details
about the errors that occurred, it is important to build a monitoring
process to gain the business benefits from it.
For example, the notification service and event service can be used to
report a significant loss of business due to a communication error or
another unexpected event. A loss of sales above $5000 can be notified to the
CxO officials, as can a business event worth $10000 that failed because of
an unexpected error such as a network outage or password expiry. Building
such a design to recover selected errors often works in favour of the
business, something that is otherwise possible only with dedicated
infrastructure such as BHM or Complex-Event processing.
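A minimal sketch of such a monitoring rule follows. The threshold mirrors the $5000 example in the text; the `FailedEvent` class, the `WATCHED_ERRORS` set and the `alerts_for` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailedEvent:
    error_type: str        # e.g. "network outage"
    business_value: float  # value of the failed business event, in dollars


# Hypothetical monitoring rule: unexpected errors that are worth an alert
# once the lost business exceeds the threshold.
WATCHED_ERRORS = {"network outage", "password expiry"}
THRESHOLD = 5000


def alerts_for(events):
    """Return one notification text per failed event that breaks the rule."""
    return [
        f"Business worth ${e.business_value:,.0f} lost to {e.error_type}"
        for e in events
        if e.error_type in WATCHED_ERRORS and e.business_value > THRESHOLD
    ]


events = [
    FailedEvent("network outage", 10000),
    FailedEvent("network outage", 200),
    FailedEvent("validation", 9000),
]
alerts = alerts_for(events)
```

Only the first event produces an alert: the second is below the threshold and the third is not one of the watched error types.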
Conclusion
The purpose of this blog is to introduce these frequently followed
principles of error handling in SOA. The right combination of the suggested
strategies, however, depends on factors within the enterprise such as the
available SOA infrastructure, cost and time, the roadmap and the development
plan.
Success therefore lies in choosing the right design aspects at the right
time, so that the enterprise is able to scale in future.