We all know that 3rd party C2C (cloud-to-cloud) devices don’t work properly on SmartThings.
This needs to be fixed.
1. Technologies of C2C
First, let’s review the underlying technology of C2C, especially SmartThings Schema.
When a connection is established between the ST cloud and a 3rd party server, two separate OAuth tokens are created.
[token1] ST cloud → 3rd party server
[token2] 3rd party server → ST cloud
Let’s take the example of a wall switch that is connected to a 3rd party server.
[token1] is used when we send a command to the device.
e.g. When you press on/off in the SmartThings app, the ST cloud sends the command to the 3rd party server, which then sends the ‘on’ command to the switch.
[token2] is used to make a callback to the ST cloud when the device status changes.
e.g. When someone presses the physical button on the wall switch, the device status changes, and the 3rd party server should report this change to the ST cloud via a callback.
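To make the two directions concrete, here is a minimal toy model in Python. It is not the SmartThings Schema API; the `Cloud` class, its `receive` method and the token strings are all made up for illustration. It only shows that each direction depends on its own token.

```python
from dataclasses import dataclass, field


@dataclass
class Cloud:
    """Toy stand-in for one side of a C2C link (hypothetical, not a real API)."""
    name: str
    accepted_token: str                       # token this side expects from its peer
    device_state: dict = field(default_factory=dict)

    def receive(self, token: str, device_id: str, value: str) -> bool:
        # Reject the request when the presented token is not the expected one.
        if token != self.accepted_token:
            return False
        self.device_state[device_id] = value
        return True


# [token1]: ST cloud -> 3rd party server (commands)
# [token2]: 3rd party server -> ST cloud (status callbacks)
partner = Cloud("partner", accepted_token="token1")
st_cloud = Cloud("smartthings", accepted_token="token2")

# User taps "on" in the app: ST presents [token1] to the partner.
command_ok = partner.receive("token1", "wall-switch", "on")

# User presses the physical button: the partner presents [token2] to ST.
callback_ok = st_cloud.receive("token2", "wall-switch", "off")

# If the partner only has a stale [token2], the callback is rejected
# and ST keeps showing the old status.
stale_ok = st_cloud.receive("stale-token2", "wall-switch", "on")

print(command_ok, callback_ok, stale_ok, st_cloud.device_state)
```

The last call is the interesting one: the command path can still be fine while the status shown by ST is already out of date.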
2. What makes C2C so unstable?
Both sides (ST cloud and the 3rd party server) need to keep their OAuth tokens safe, and the OAuth server on each side has to respond correctly to OAuth refresh requests.
But this is not what happens in the real world of SmartThings C2C Schema.
Actually, sending commands to C2C devices [token1] is relatively okay; most of the problems come from the status callback [token2].
Since discovery and state refresh requests are made every 24 hours, when [token1] is broken the problem can be detected by the SmartThings cloud within 24 hours at most.
24 hours is such a LONG time for users, but at least the problem does get detected eventually.
Then a push notification is sent to the user from the SmartThings app, and the user is forced to log in to the 3rd party account again, which fixes both [token1] and [token2].
…
However, when [token2] (= callback token) is broken (whether the 3rd party server lost it through an error, or the ST cloud failed to respond to a refresh request for [token2]), it cannot easily be detected by the ST cloud.
In this situation:
- The wrong device status is shown in the ST app.
- Worse, when a user makes an automation that is triggered by a C2C device, the automation won’t work.
- Worst of all, there’s NO WAY to fix a [token2] error. The only solution is to REMOVE the C2C connection and RECREATE it. Then the device is removed from all the automations and scenes that are related to it.
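Why is a broken [token2] so hard to notice from the ST side? A rough sketch, assuming a partner that refreshes the callback token when it expires. `CallbackSender`, `failing_refresh` and the event strings are hypothetical names, not part of any real SDK. When the refresh fails, the callback simply never leaves the partner, so the ST cloud sees no error at all, just silence.

```python
from typing import Callable, List, Tuple


class CallbackSender:
    """Hypothetical partner-side sender of status callbacks using [token2]."""

    def __init__(self, token: str, expires_at: float,
                 refresh: Callable[[], Tuple[str, float]]):
        self.token = token
        self.expires_at = expires_at
        self.refresh = refresh      # asks the ST-side OAuth server for a new token
        self.dropped = 0            # callbacks that never left the partner

    def send(self, now: float, deliver: Callable[[str, str], None],
             event: str) -> bool:
        if now >= self.expires_at:
            try:
                self.token, self.expires_at = self.refresh()
            except RuntimeError:
                # Refresh failed: the event is lost and ST is never told.
                self.dropped += 1
                return False
        deliver(self.token, event)
        return True


def failing_refresh() -> Tuple[str, float]:
    # Simulates a refresh request that the OAuth server does not answer correctly.
    raise RuntimeError("refresh rejected")


received: List[str] = []            # what the ST cloud actually sees
sender = CallbackSender("token2", expires_at=100.0, refresh=failing_refresh)

sent_before = sender.send(50.0, lambda tok, ev: received.append(ev), "switch=on")
sent_after = sender.send(150.0, lambda tok, ev: received.append(ev), "switch=off")

print(sent_before, sent_after, received, sender.dropped)
```

From the ST cloud’s point of view, `received` just stops growing. That looks the same as a switch nobody touched, which is why the stale status goes unnoticed.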
3. Google’s case
Now, let’s see how other companies do it.
We all know that Google Home integrates with the most IoT devices and works stably, even though it relies heavily on C2C.
Let’s look inside Google Home.
Google Home allows a FIXED token to be used for the callback.
Below is the screen capture of Google’s Home Graph (Google Home C2C IoT) Service Account Key settings with my HomeAssistant server.
They know a fixed key could cause security problems, so Google says they crawl the web for possible leaks of the fixed token, and disable the token if they find one.
Google also provides something called “Workload Identity Federation”.
It uses short-lived tokens, like ST does, but it relies on Workload Identity Pool Providers from services like AWS, Azure etc., which are reliable key maintainers.
Google KNOWS that small IoT companies are NOT capable of keeping an OAuth callback token healthy.
Anyway, for now, Google allows BOTH a FIXED token and Workload Identity Federation.
Which means they still allow a fixed token to be used for the status callback (= [token2]).
That’s because, even though Google knows about the security risk, they know even better how limited their small partners’ capabilities are, and moreover, they CARE ABOUT USERS’ EXPERIENCE.
4. My suggestions
For the past several years, users of SmartThings C2C have been pissed off by the errors, especially the errors of the callback token [token2].
The best solution is to keep the OAuth servers healthy on both sides (ST [token1] and 3rd party [token2]), BUT in the real world it seems impossible to achieve this.
I suggest the following.
- The ST app should allow users to FORCE a re-login to the partner’s server even if everything seems OK from the SmartThings side.
For now, a user can’t re-login to the partner server unless it shows ‘disconnected’ in the ST app.
As I mentioned above, even if it looks okay from the SmartThings side, it may be broken in reality.
And the only way to fix this is to DELETE and RE-LOGIN, but this removes all the device names, automations, scenes etc., and after the re-login the user has to set all those things up again. VERY BAD USER EXPERIENCE.
If the ST app allowed users to FORCE a re-login, users could fix C2C problems more easily without losing all their settings.
This is only a temporary solution, but it can be applied IMMEDIATELY and would make the user experience better.
- Poll frequently and detect errors
ST should not rely on callbacks from partners. We all know that they are not reliable at all.
The polling interval should be much shorter than once every 24 hours. As far as I know, Google Home polls its partners more frequently than SmartThings polls its C2C partners.
Also, ST could detect whether the command path is broken from errors received while polling, and detect whether the callback path is broken from discrepancies between the polled status and the callback status.
If ST detects an error on either side, it should mark that C2C connection as ‘disconnected’ and notify the user to re-login.
- Change the callback token
The fundamental solution is to make the callback system robust.
A long-lived access token could be one solution. If the expiration date of the callback token [token2] is simply set longer, ST and partners don’t need to change their code, so backward compatibility is kept.
I know this can be a security concern for ST, but USERS CAN’T TOLERATE THIS UNSTABLE C2C SYSTEM ANY LONGER.
ST could also adopt Google’s approach for short-lived tokens.
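The polling suggestion above (use poll errors to catch a broken command path, and poll-vs-callback discrepancies to catch a broken callback path) could look roughly like this. It is only a sketch under my own assumptions: `PollResult`, `LinkMonitor` and the `mismatch_limit` threshold are hypothetical, and I have no idea how the ST cloud is really implemented. Waiting for a few mismatches in a row is my own addition, so that a callback that is just a little late does not trigger a false alarm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PollResult:
    ok: bool                        # False if the poll itself failed (e.g. auth error)
    state: Optional[str] = None     # device status reported by the partner


class LinkMonitor:
    """Hypothetical health check for one C2C connection."""

    def __init__(self, mismatch_limit: int = 2):
        self.mismatch_limit = mismatch_limit
        self.mismatches = 0         # consecutive polls that disagreed with callbacks

    def observe(self, poll: PollResult, last_callback_state: Optional[str]) -> str:
        # A failed poll means the command path ([token1]) is broken.
        if not poll.ok:
            return "disconnected"
        # Polled status differs from what callbacks told us: the callback
        # path ([token2]) may be broken. Count consecutive mismatches.
        if poll.state != last_callback_state:
            self.mismatches += 1
        else:
            self.mismatches = 0
        if self.mismatches >= self.mismatch_limit:
            return "disconnected"
        return "connected"


monitor = LinkMonitor(mismatch_limit=2)
first = monitor.observe(PollResult(True, "on"), "on")     # poll and callbacks agree
second = monitor.observe(PollResult(True, "off"), "on")   # one mismatch, tolerated
third = monitor.observe(PollResult(True, "off"), "on")    # still mismatched
fourth = monitor.observe(PollResult(False), "on")         # the poll itself failed

print(first, second, third, fourth)
```

Once `observe` returns "disconnected", the app could show the connection as ‘disconnected’ and ask the user to re-login, without deleting anything.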
I hope SmartThings has the WILL to fix unstable C2C devices.