Delogue was experiencing technical issues, rendering the platform unavailable from time to time.
Among other things, the platform's timeout was increased in December because some suppliers could not download large files before the download timed out.
This resulted in many open connections at once.
We are now trying to ensure that connections are closed regularly and that no single connection runs for too long, without impacting the usability of the platform.
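As a rough illustration of the idea, and not the platform's actual code, a connection pool can retire any connection once it passes a maximum age, so that nothing stays open indefinitely. The sketch below assumes Python and an in-memory SQLite database purely so that it is self-contained; `ExpiringPool`, `max_age` and `clock` are hypothetical names.

```python
import sqlite3
import time
from collections import deque


class ExpiringPool:
    """Toy connection pool that retires connections older than max_age seconds.

    Hypothetical sketch: SQLite stands in for whatever database is really used.
    """

    def __init__(self, max_age: float, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock        # injectable so the age check can be tested
        self._idle = deque()      # (connection, created_at) pairs
        self.closed = 0           # number of connections retired so far

    def acquire(self):
        # Reuse an idle connection only if it is still young enough.
        while self._idle:
            conn, created_at = self._idle.popleft()
            if self.clock() - created_at < self.max_age:
                return conn, created_at
            conn.close()
            self.closed += 1
        # Nothing reusable: open a fresh connection and record its birth time.
        return sqlite3.connect(":memory:"), self.clock()

    def release(self, conn, created_at):
        # Retire the connection on return if it has outlived max_age.
        if self.clock() - created_at >= self.max_age:
            conn.close()
            self.closed += 1
        else:
            self._idle.append((conn, created_at))
```

Production pools usually offer this as a setting (often called a maximum lifetime or recycle time), so a hand-written pool like this is rarely needed.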
We discovered that the connection pool limit was being reached. A fix was scheduled overnight to avoid additional downtime.
The connection pool limit was raised by 1000 %.
Unfortunately, the issues persist.
Additional monitoring was introduced and API call logging scheduled.
Additional logging for API calls was introduced to monitor connections.
A further increase of the connection pool was planned.
External consulting experts were called in to help analyse the issues.
Several corrective actions have been taken:
- We have identified places where queries were not closing the connection correctly
- The underlying code has been changed to handle connections better
- We have increased the connection pool fivefold
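To illustrate the first two points in general terms: a query that raises an error before reaching its close call leaves the connection open, and enough of these will exhaust a pool. The sketch below is a generic Python example using the standard `sqlite3` module, not the platform's actual code; `fetch_one` is a hypothetical helper. It uses `contextlib.closing` so that the connection is closed whether or not the query succeeds.

```python
import sqlite3
from contextlib import closing


def fetch_one(db_path: str, sql: str):
    """Run a query and return its first row, always closing the connection."""
    # closing() guarantees conn.close() runs even if the query raises.
    # (A bare "with sqlite3.connect(...)" only commits or rolls back;
    # it does not close the connection.)
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql)
            return cur.fetchone()
```

For example, `fetch_one(":memory:", "SELECT 1 + 1")` returns the single row of the result, and a failing query still releases its connection before the error propagates.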
We unfortunately continue to have issues with stability and the team is working on finding the root cause.
We have not experienced issues with the platform at the 'usual' time (10 am CET).
Around 12 noon we have a short one-minute restart.
We've identified scheduled jobs that may cause this.
The team continues to analyse the data and logs from the past days' instability to ensure we get to the root cause of this.
We continue to have our main focus on this and will keep updating this article as we know more.
Updated 26.01.23, 13:50 CET