Friday, December 01, 2023

Oracle Integration: How to replace a Connection by one with a different role?

In this blog article I describe how to replace an OIC Connection by one with a different Role (e.g. Trigger and Invoke by Trigger), which “officially” is not supported.
 

An issue I quite often encounter is that OIC developers have made mistakes with the Role of the Connections they create. Somewhat unfortunately, the default Role is Trigger and Invoke, which in 90% of the cases is the wrong choice. It should be either Trigger or Invoke, except for the few cases where you not only want to actively invoke APIs or services of an application but also want to be triggered by events from that application. Connections to SaaS applications are a typical example of this, when you want to use APIs and services and also want to subscribe to events.

The argument for making the Role either Trigger or Invoke is separation of concerns regarding security settings. At some point you will probably have to configure security in such a way that for the Integration's trigger (which for REST often started as Basic Authentication or OAuth 2.0 Or Basic Authentication) only OAuth 2.0 is allowed, if not on the OIC Development instance then at least on the OIC Acceptance and Production instances. For an Integration's invokes you will have to configure the Connections to comply with the security required by the target applications. For Oracle SaaS applications this often implies using a Confidential Application with the Client Credentials grant.
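As an illustration, the following minimal Python sketch shows that Client Credentials flow; the token URL, client id, client secret and scope are placeholders for the values of the Confidential Application in your identity domain:

  import requests

  # Placeholders: take the real values from the Confidential Application
  # registered in your identity domain (IDCS/IAM).
  TOKEN_URL = "https://idcs-xxxx.identity.oraclecloud.com/oauth2/v1/token"
  CLIENT_ID = "my-client-id"
  CLIENT_SECRET = "my-client-secret"
  SCOPE = "urn:opc:resource:consumer::all"  # placeholder; depends on the target

  def get_access_token() -> str:
      """Fetch a bearer token using the OAuth 2.0 Client Credentials grant."""
      response = requests.post(
          TOKEN_URL,
          data={"grant_type": "client_credentials", "scope": SCOPE},
          auth=(CLIENT_ID, CLIENT_SECRET),
      )
      response.raise_for_status()
      return response.json()["access_token"]

  # Every subsequent call then passes the token as a bearer token:
  # headers = {"Authorization": f"Bearer {get_access_token()}"}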

After having developed a hundred Integrations or more using a Connection with the wrong Role, it is not funny to find out that you cannot simply fix this by replacing the Connection with one that has the proper Role: in the Integration Designer, replacing a Connection with one of a different Role is not supported.

What is also not funny is when several Integrations use a Trigger and Invoke Connection that was created with Basic Authentication, and then someone decides to change its security configuration to OAuth 2.0, for example like this:


When you then try to create a new Integration using this Connection as a trigger, you will find that you cannot do so:


So, what might have looked like a valid change has, strictly speaking, corrupted all Integrations using this Connection as a trigger. I have not tried it out, but I wonder whether you would still be able to reactivate them, and what would happen when you migrate to Gen3.

So how to get out of this situation? The following describes how you can work around the restriction that you cannot replace a Connection with one of a different Role.

In short, the trick is (with many thanks to my colleague Marc Smeenge who actually discovered this):

  • Temporarily create new Connections with (still) the wrong Role
  • Change the Integrations to use those new ones
  • Export the Integrations
  • Recreate the temporary Connections with the same identifiers, but this time with the proper Role
  • Import the Integrations
  • Replace the temporary Connections with the proper ones

The following describes this in detail.


Setup

The setup is as follows:

I have 2 Connections that are configured wrongly:

  • SMP_OIC_REST_Trigger_Invoke Connection that is used for all Integrations with a trigger based on the REST Adapter, which should have been Trigger only
  • SMP_ERP_REST_Trigger_Invoke Connection that is used to call Fusion ERP REST APIs, which should have been Invoke only



 I have 2 Connections that are properly configured:

  • SMP OIC REST Trigger
  • SMP ERP REST Invoke

I have 2 integrations that both use the wrongly configured Connections:

  • SMP Purchase Orders
  • SMP Accounts

As you can see in the following picture, I am not able to replace the SMP_OIC_REST_Trigger_Invoke Connection with the SMP OIC REST Trigger one, as replacing Connections with different Roles is not supported:


Now for the sake of example, let's assume I want to do the replacement for the SMP Purchase Orders integration only, as the other one is not mine to fix.

The recipe to handle this is as follows:

1. Put all Integrations to fix in a package (putting Integrations in a package is a best practice anyway). In my example I have put the SMP Purchase Orders Integration in a package named samples.

2. Clone the wrong Connections, for example to ones that are suffixed with TEMP. These still have the wrong Role, and you can give them any security policy, as long as they are fully configured before you move on. In my example I have created the SMP_OIC_REST_TEMP and SMP_ERP_REST_TEMP Connections.


 

3. Create a new 2.0 version of all Integrations to fix and put them in a package of their own. In my example I have put the 2.0 version of SMP Purchase Orders in a package named samples.fixing.

 



4. Replace the wrong Connections with their clones. In my example I have replaced:

  • SMP_OIC_REST_Trigger_Invoke by SMP_OIC_REST_TEMP 
  • SMP_ERP_REST_Trigger_Invoke by SMP_ERP_REST_TEMP
 

5. Export the package with the 2.0 versions. In my example that is the samples.fixing package.
6. Delete the 2.0 versions of the Integrations to fix.
7. Delete and recreate the cloned Connections one by one, this time with the proper Role. In my example I now have:

  • SMP_OIC_REST_TEMP with Role Trigger
  • SMP_ERP_REST_TEMP with Role Invoke.

8. Import the package. In my case this is the samples.fixing package. As you can see this went without issues:




9. Replace the cloned, wrong Connections with the proper ones. I can now replace:

  • SMP_OIC_REST_TEMP with Role Trigger by the proper SMP OIC REST Trigger 
  • SMP_ERP_REST_TEMP with Role Invoke by the SMP ERP REST Invoke.

10. Delete the 1.0 versions of the Integrations to fix. In my case that is only the SMP Purchase Orders 1.0 version.
11. Version the 2.0 Integrations back to 1.0 ones with the original package name. I now have an SMP Purchase Orders 1.0 version in the samples package.

As you can see in the following picture, the samples package now uses both the old, wrong Connections (as I did not replace them for the SMP Accounts Integration) and the new, proper Connections:


Problem fixed!
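
As a closing note: when there are many packages to fix, the export (step 5) and import (step 8) can also be scripted against the OIC REST API. The following is a minimal Python sketch; the endpoint paths are assumptions based on the Gen2 REST API, so verify them (as well as host and credentials) against the documentation of your own instance:

  import requests

  OIC_HOST = "https://my-oic-instance.example.com"   # placeholder
  AUTH = ("my.user@example.com", "my-password")      # or a bearer token
  PACKAGE = "samples.fixing"

  # Step 5: export the package with the 2.0 versions as a .par archive.
  # Assumed endpoint: GET /ic/api/integration/v1/packages/{name}/archive
  response = requests.get(
      f"{OIC_HOST}/ic/api/integration/v1/packages/{PACKAGE}/archive",
      auth=AUTH,
  )
  response.raise_for_status()
  with open(f"{PACKAGE}.par", "wb") as archive:
      archive.write(response.content)

  # ... steps 6 and 7 in between: delete the 2.0 versions and recreate
  # the TEMP Connections with the proper Role (still manual work) ...

  # Step 8: import the package again.
  # Assumed endpoint: POST /ic/api/integration/v1/packages/archive
  with open(f"{PACKAGE}.par", "rb") as archive:
      response = requests.post(
          f"{OIC_HOST}/ic/api/integration/v1/packages/archive",
          auth=AUTH,
          files={"file": (f"{PACKAGE}.par", archive)},
      )
  response.raise_for_status()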

Friday, August 18, 2023

OIC: Reverse Versioning Integrations

In this article I introduce a versioning strategy where (probably very unlike what you are used to) the integration with the smallest version number is the latest and greatest, making it clear to everyone what the state of development is, and preventing OIC from getting polluted with old versions nobody knows what to do with anymore.

A new integration typically is created on an OIC DEV environment (instance) as version 1.0.0. Then at some point most people create a version 1.0.1, 1.1.0 etc. It is not uncommon to have a version like 3.4.2 before the first one is promoted to the next OIC environment. Very often many previous versions are still there on the DEV environment and never cleaned up.

This often, if not always, introduces several challenges:

  • While the integration is still being worked on but the developer cannot be reached, it is unclear what the latest and greatest version is. You should not just assume it is the one with the highest version number, as that might be just some interim version that will get thrown away. Also, different developers may use different versioning practices, if any at all.
  • When the developer is away for long or has left the project, it can be quite a challenge to determine what can be cleaned up without risking losing useful backups.
  • Even for the developer it may not be clear what is what anymore, with a high risk of losing work (been there …).
  • When source control is used, the repository tends to get polluted with multiple versions of which most are obsolete (instead of keeping on versioning the same integration).

I have seen customers with hundreds of integrations on their DEV instance, around a third of them not activated and some more than 2 years old. Good luck cleaning that up!

All this can be prevented by practicing a very simple versioning strategy, which I call “reverse versioning”. 

The few, simple steps for using it are the following:

  1. Start your first version as 1.0.0,
  2. Work on it until you have developed it to some state you don’t want to lose,
  3. If you are using a source control system like Git, commit this 1.0.0 version to Git,
  4. Delete any existing, older 1.0.1 version (see next step),
  5. Create a new version from the current 1.0.0 as 1.0.1,
  6. If it is used by other components or people, activate the 1.0.1 version. At this point in time this is the last-known-good version,
  7. Now here it comes: return to step 2 by continuing development on the 1.0.0 version (instead of the 1.0.1 or any other next version).

You create the 1.0.1 version for 2 reasons:

  • As a convenience so that you can easily revert to the last-known-good version if you want to undo your latest changes.
  • To have something activated when the integration is being used.

If step 3 above is applicable to you, then you can also retrieve the last-known-good version from Git, but restoring a backup from the Integration Composer is easier of course. If you don’t have source control in place (poor you!), then you could consider creating a few extra backups like 1.0.2, but be very restrictive with that.

Now what if, some next time, you are supposed to increase the version number, for example because the previous one is already in production? In that case the version used in step 1 above will be that new version, and when you use version control you can delete the previous one from DEV.

All this has the following advantages:

  • For every developer it is clear what the status of an integration is:
    • If there is a 1.0.1 version next to a 1.0.0 one, then you know it is under development.
    • If there is no 1.0.1 version, then the 1.0.0 is the latest and greatest.
  • When there is an inactive 1.0.1 version while the 1.0.0 has been activated more than a working day ago, you can safely throw the 1.0.1 away. After all, it should only exist while the integration is under development (this can even be scripted, see the sketch after this list).
  • When you use a version control system you only commit the 1.0.0 version, so no redundant versions in there.
  • When promoting a new release to the next environment, it is clear which versions to use (the ones with the smallest version number). No need to start asking around.
  • It supports a consistent way of using semantic versioning (https://semver.org/).
  • No surprise for Applications and Systems Administrators regarding unexpected gaps in version numbers.
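
That cleanup rule can even be automated. Here is a minimal Python sketch, assuming the Gen2 REST API endpoint for listing integrations and its item fields code and version (verify both against the documentation of your instance):

  import requests
  from collections import defaultdict

  OIC_HOST = "https://my-oic-instance.example.com"   # placeholder
  AUTH = ("my.user@example.com", "my-password")

  # List all integrations on the (DEV) instance.
  # Assumed endpoint: GET /ic/api/integration/v1/integrations
  response = requests.get(
      f"{OIC_HOST}/ic/api/integration/v1/integrations",
      auth=AUTH,
  )
  response.raise_for_status()

  versions = defaultdict(list)
  for integration in response.json().get("items", []):
      versions[integration["code"]].append(integration["version"])

  for code, found in versions.items():
      if len(found) > 1:
          # With reverse versioning the smallest version is the one being
          # worked on; anything higher is a backup (assumes the zero-padded
          # version strings OIC uses, so string comparison works).
          work = min(found)
          backups = sorted(v for v in found if v != work)
          print(f"{code}: working on {work}, backup(s): {', '.join(backups)}")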

I have introduced this practice on all my latest projects, and from experience I can tell you that reverse versioning is easily picked up by any developer. Once in place, I have not had a single problem with people losing code, not knowing what to promote to the next environment, or the OIC instance getting polluted with obsolete versions. Everybody wins!


Monday, October 17, 2022

OIC: Structured Process Custom Fault Handling Pattern

Fault handling in a Structured Process in Oracle Integration (OIC) is not always trivial, especially not when you have specific requirements for it. This article describes how such a challenge can be approached by means of the Custom Fault Handling pattern.

In the following, first the out-of-the-box fault handling using the default fault policy is described, together with some arguments why this might not properly tackle your situation, while simply disabling it is not the proper option either. After that it is described how implementing custom fault handling can provide a good alternative.

Handling with Default Fault Policy

When activating a Structured Process in OIC the default option is Use Fault Policies with the Default checkbox checked.


This implies that the Structured Process handles a fault when invoking an Integration (or Web Service or REST API; in the remainder all referred to as “service”) by doing 2 retries in a row with exponential back-off (the 1st retry after 5 seconds, the 2nd retry 10 seconds after the 1st).
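
To make that behavior concrete, the following small Python model shows the schedule the default policy applies (purely illustrative; this is obviously not how OIC implements it):

  import time

  def invoke_with_default_policy(call, retries=2, first_delay=5):
      """Illustrative model of the default fault policy: 2 retries with
      exponential back-off (5 seconds before the 1st retry, 10 seconds
      before the 2nd)."""
      delay = first_delay
      for attempt in range(retries + 1):
          try:
              return call()
          except Exception:
              if attempt == retries:
                  raise              # after the 2nd retry the instance faults
              time.sleep(delay)      # 5 seconds, then 10 seconds
              delay *= 2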


 

 This is probably not what you want, for one or more of the following reasons:

  • The cause of a fault probably makes these automatic retries not useful, as the call is unlikely to succeed at such short notice. Think of situations where a service is temporarily unavailable, the credentials or authorization are not properly configured, or there is something wrong with the configuration of the target application.
  • In case of a service that is not available, retries at short notice may add insult to injury when the root cause is an overload of the service.
  • In case of wrong credentials, the collateral damage could be that the used account gets locked out.
  • When the service creates or updates data in a SaaS application, the call probably is not idempotent (that is, it cannot be called more than once with the same result), which makes any retry before investigation “dangerous”.

In practically all cases I therefore ended up activating the Process Application with fault policies turned off, which (considering the checkbox is checked by default) is easy enough to forget, by the way.

Handling Without Default Fault Policy

With fault policies turned off the process stops after running into a fault. The fault can then be investigated and handled from the Workspace using one of 4 options:

  • Abort: this aborts the process instance and all associated instances (like a parent process in the same application which started the culprit instance).
  • Retry: this retries the invoke to the service.
  • Alter Flow & Suspend: this gives the administrator the option to move the token of the instance somewhere else in the flow, for example to some previous activity (which would then make the service call happening again) or some later activity while ignoring the fault. 
  • Cancel: this aborts the culprit process instance but will leave all other related instances running.

This article is not meant to guide you on how to use any of these options (for that you are referred to the section Monitor and Adjust Process of the online documentation). What I do want to point out, though, is that this way of handling faults might also not be what you are looking for, for one or more of the following reasons:

  • It does not support advanced fault policies, like:
    • Retries after minutes or hours, with the last one outside the time-window after which it should work as per its SLA.
    • Different ways of handling, depending on the service or nature of the fault. For example, in case of a timeout an automatic retry may make sense but not in case of a security or data issue.
  • Although advanced enough to handle practically any fault (you can not only move the token but also change the payload of the instance), Alter Flow might be too advanced for the administrator, especially when not proficient with BPMN.
  • It does not support involving a business user, which might be needed in case of a data issue. For example, think of a situation where submitting an order fails because the customer has not yet been validated by the business.

When the above applies to your situation, you probably need custom fault handling. The following describes what that can look like.

Custom Fault Handling

The core of the custom fault handling solution is a 3-step process:

  1. The process catches the fault and forwards it to a generic fault handling process.
  2. In the fault handling process a fault policy is applied, or the fault is handled by an administrator or business user. The outcome of the fault handling process is an action with one of the values “abort”, “retry”, or “continue”, which is passed back to the process instance having the issue.
  3. The process instance acts upon the returned action. What that is, depends on the nature of the process but typically is one of:
    1. Stop processing (abort)
    2. Retry the service call (retry)
    3. Ignore the fault and move on (continue)
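
Before looking at the BPMN, the contract between the two processes can be summarized in a few lines of Python; all names below are mine and only model the pattern, they are not OIC artifacts:

  class ProcessAborted(Exception):
      """Maps to the Error End event that the parent process must capture."""

  def handle_fault(fault, retry_count):
      """Stand-in for the generic fault handling process (the Send/Receive
      pair); returns one of "abort", "retry" or "continue"."""
      raise NotImplementedError

  def invoke_with_custom_fault_handling(call_service):
      """Model of the 3-step protocol between a process instance and the
      generic fault handling process."""
      retry_count = 0
      while True:
          try:
              return call_service()                 # the service invoke
          except Exception as fault:
              # Steps 1 and 2: forward the fault and wait for the outcome.
              action = handle_fault(fault, retry_count)
              # Step 3: act upon the returned action.
              if action == "abort":
                  raise ProcessAborted from fault   # stop processing
              if action == "retry":
                  retry_count += 1                  # the policy "knows" this
                  continue                          # retry the service call
              if action == "continue":
                  return None                       # ignore the fault, move on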

The following picture shows how the process instance catches the error and forwards it to the fault handling process using a Send activity. After the Send the fault handling process determines what should happen next, which is received by the process instance using a Receive activity.

This process is implemented as a Reusable Subprocess which is invoked by a parent process using a Call activity. The pattern for invoking a service and catching the errors is the same for each service, except for the type of errors to catch and the flows coming out of the chosen action? gateway. The first depends on the nature of the service (and for example is different for SOAP services than for REST APIs), while the applicable flows depend on the viable options in the context of the parent process.

The service call has 2 Boundary Error events, one for the BindingFault and one for the RemoteFault. It also has a Boundary Timer event to catch a timeout. You could use a mechanism that sets a configurable amount of time on the timer, depending on the service provider.

In the Map Fault Handling Request activity sufficient information is mapped to the request of the fault handling process, so that it can apply the proper fault policy or help the administrator or end user to understand the nature of the fault. With the Start Fault Handling activity the fault handling process is started, while the errored process instance waits for what to do next in the Receive Resolution Action activity.

In case of “retry”, a retry counter is increased so that the next time the fault handling process “knows” how many times the service has already been invoked. In case of “abort” an Error End event is thrown, which must be captured by the parent process.

The following picture shows how the generic fault handling process could look like.



The Apply Fault Handling Policies activity is a call to some business rule which determines what the fault handling process should do next:

  • Return “retry” to the process instance after a (configurable) amount of time.
  • Forward the fault to a business user by means of a Human Task.
  • Send the fault to an administrator by means of a Human Task. The administrator can also forward the fault to a business user.

The fault policy can be implemented by a Decision Service or some other component which behaves like a business rule engine. That means that the component does not act upon the fault itself, but instead returns some outcome which determines the next step of the fault handling process. 

The following gives an example of the implementation of a fault policy using the OIC Decision Service (many thanks to Marcel van der Glind who inspired me with his example).

As you can see, the action which determines the next step depends on the name and version of the service, the fault code, and the number of retries already done. In this example, in case of a fault code with value 500, the Faulting Service version 1.0 should be retried after 5 and then after 10 seconds (for the sake of example, and due to my lack of patience, the same as the default fault policy of OIC itself) and, after the second attempt has failed, be forwarded to the administrator. In case of a fault code other than 500 the fault should go straight to the administrator, as is the case for any other service or version.
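
Translated into code, this decision table behaves like the following Python sketch (the service name, fault code and timings are the ones from the example; the function itself is just my illustration, not a Decision Service artifact):

  def fault_policy(service, version, fault_code, retry_count):
      """Returns (action, delay in seconds) for a caught fault,
      mirroring the decision table of the example."""
      if service == "Faulting Service" and version == "1.0" and fault_code == "500":
          if retry_count == 0:
              return "retry", 5      # 1st retry after 5 seconds
          if retry_count == 1:
              return "retry", 10     # 2nd retry after 10 seconds
      # Any other fault code, service or version, or retries exhausted:
      return "administrator", 0      # forward to the administrator Human Task

  # Example: second failure of Faulting Service 1.0 with fault code 500.
  assert fault_policy("Faulting Service", "1.0", "500", 1) == ("retry", 10)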

After automatic or manual determination of the action, the fault handling process returns it to the culprit process instance by means of the End event, which is a callback to the aforementioned Receive Resolution Action Receive activity.

As explained, the fault handling process is generic and therefore implemented in a Process Application of its own. To minimize the risk of an exception happening in the exception handler itself, it should be kept as simple as possible and should not unnecessarily depend on any other component which in its turn might fail.

Assuming that the administrator will be given access to the Workspace, you should consider using an out-of-the-box web form to implement the administrator Human Task. Web forms are somewhat limited, so you cannot implement logic that would hide buttons for actions that are not applicable; hence the valid action? loop back in the flow. As far as the Human Task for business users is concerned, you probably need to implement that using a more advanced UI technology like Visual Builder.