An interesting edge case came up with our Stripe application that we wanted to share with the community.
While building out the webhook functionality for AccountDock, we stumbled onto a bug that was causing our webhook code to be triggered twice.
This stood out to us because we're already accounting for replay attacks, as Stripe's webhook documentation recommends:
"We also advise you to guard against replay-attacks by recording which events you receive, and never processing events twice."
We do this through a relatively trivial process:
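A check of this kind can be sketched roughly as follows. This is a minimal illustration rather than our production code: the `processed_events` table, the `handle_webhook` function and the event ID `evt_123` are all hypothetical, and SQLite stands in for whatever database you use.

```python
import sqlite3

# Hypothetical schema: one row per Stripe event ID we have already handled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT)")


def handle_webhook(conn, event_id):
    """Return True if the event was processed, False if treated as a replay."""
    # 1. Look up the event ID.
    seen = conn.execute(
        "SELECT 1 FROM processed_events WHERE event_id = ?", (event_id,)
    ).fetchone()
    if seen:
        return False  # already recorded: reject as a replay
    # 2. Record the event; the real handler logic would run after this.
    conn.execute("INSERT INTO processed_events (event_id) VALUES (?)", (event_id,))
    conn.commit()
    return True


first = handle_webhook(conn, "evt_123")   # hypothetical event ID
second = handle_webhook(conn, "evt_123")  # same event delivered again
print(first, second)  # True False
```

Note that the lookup and the insert are two separate steps, which matters for what follows.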
What happens when a Stripe account has connected to both our development and our production applications?
For example, they connected to our Stripe development application while they were building their platform and then, once they went live with payments, connected to our production application to give us access to their live transactions.
Here's a visual of what they would see in their App Settings page if they did so:
Well, here's what happens (as our server sees it):
Our servers are set up to receive webhooks from both the development and the production application connections at one endpoint (e.g. https://accountdock.com/webhooks).
(This may be unique to us, so I'll come back to this later as a pre-condition to this edge case.)
Ordinarily, the two webhooks (one per connected application, both carrying the same event) would arrive sequentially, causing the second one to fail our check for replay attacks.
However, and this is a big however:
What happens if Stripe sends these webhooks so close together that they're effectively being made in parallel?
If the requests were made at even moderately different times (e.g. 10+ milliseconds apart), the following would happen:
This would all happen before the next webhook was received, which would result in that webhook failing, since it would be perceived as a replay attack.
However, that didn't happen.
We've tested webhooks hundreds of times.
But this one time, something different happened.
The webhooks were received so close together that the insert query recording the first (unique) event was still running when the second webhook arrived. Our lookup found no record of the event yet, so our server treated the second webhook as unique too.
Our insert queries are fast, but these webhooks were fired even faster (which could make sense: I'm sure Stripe has a robust queue system to handle outbound webhooks).
This caused the controller logic to fire twice, and the event to be logged twice in our database:
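The interleaving can be reproduced in a small simulation. This is only a sketch under stated assumptions: an in-memory set stands in for the event-tracking table, a `threading.Barrier` forces both lookups to complete before either insert (imitating the slow-insert window), and all names are hypothetical.

```python
import threading

processed_events = set()             # stands in for the event-tracking table
handled = []                         # one entry per run of the "controller logic"
both_checked = threading.Barrier(2)  # holds each request until both have done their lookup


def handle_webhook(event_id, source):
    is_new = event_id not in processed_events  # lookup: is this event already recorded?
    both_checked.wait()                        # imitate the window while the insert is in flight
    if is_new:
        processed_events.add(event_id)         # "insert" the event record
        handled.append(source)                 # controller logic fires


threads = [
    threading.Thread(target=handle_webhook, args=("evt_123", name))
    for name in ("development", "production")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(handled))  # ['development', 'production']
```

Both requests pass the lookup before either has recorded the event, so the handler runs once per webhook instead of once per event.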
The application-level implications of this were fairly manageable:
A receipt was sent twice to a customer.
However, we were lucky.
If we were relying on truly unique webhooks for something more vital (e.g. issuing a charge after a credit card was updated or a customer record was created), this could have resulted in multiple charges that weren't immediately noticed.
This bug is an edge case for three reasons:
Despite these three conditions, we were hit by it.
The solution was actually pretty simple:
By locking your table for what will likely be less than 50ms, you ensure that any lookups on that table in the interim are held until the lock has been released.
As mentioned, this is an edge case, and AccountDock was the perfect storm for it. However, we'd advise anyone using Stripe to power their payments to take note of it (and even more so, those building Stripe applications).
The implications could be pretty damaging if this case is not accounted for, and considering the simplicity of the fix (e.g. a WRITE lock), we think it's worth it.