Kickstart in CI environment unpredictable
-
We are using FusionAuth in a greenfield product. One of the main reasons we are using FusionAuth is the ability to run an isolated instance in CI and development. The Kickstart file is important here, as it allows us to easily test new configurations.
However, we have noticed our tests failing more and more often because FusionAuth has not been configured. This is related to the size of the Kickstart file itself (e.g., introducing themes in the Kickstart file completely breaks the end-to-end tests). The failure is always that the client/application is not configured.
FusionAuth becomes available before the Kickstart process has completed. Worse, it remains available after the Kickstart process has failed. This means we don't know if or when it is ready to be used in the tests.
We thought about fixing this by polling the OAuth configuration endpoint before running the tests. This works sometimes, but it seems that the polling overloads the server and the Kickstart process then fails with timeouts.
Is there any best practice for the Kickstart file that we should consider, and a way to know if and when it has completed? The only other option we see is to dump the database after running the Kickstart process locally and use that as the basis for our end-to-end tests, but that would bring an entirely different set of problems with it.
Would appreciate any help here.
-
Hmmm.. Sorry, this sounds frustrating. A couple of things:
- If kickstart is failing/choking because of the size of the file, we want to know about that. Can you please file a bug (preferably with reproduction steps and the kickstart files that cause the failure)? https://github.com/fusionauth/fusionauth-issues/issues
- We'd also love to know about the polling failures. Are they replicable? Are you pulling the entire application object?
- It sounds like you'd like a webhook or some other async notification when kickstart is done. Is that right? If that is the case, can you file a feature request: https://github.com/fusionauth/fusionauth-issues/issues There is no way to do that right now, but I think that'd be a dev friendly feature.
Finally, to deal with the proximate issue, I would see if you could poll for something lighter weight. Perhaps create a no-op lambda at the very end of the kickstart file and test to see if that exists?
The other thing is I wonder if FusionAuth is resource starved in CI? What kind of hardware is the CI process running on? Is FusionAuth throwing any errors in the system log while CI is going on? How much memory does it have?
-
I don't think it is the file size as such. We have a flickering CI run now, where sometimes everything is fine and other times it seems the Kickstart didn't run at all. (At this point we have not collected the logs from the CI server; that is something I will only be able to get to next week at the earliest.)
We have seen similar issues locally, but not something that we can repeat 100% of the time.
When polling we poll for something like this:
https://DOMAIN/api/application/UUID/oauth-configuration
Because that indicates whether the essential parts of the Kickstart file actually ran AND it can be requested without any auth information. We do this up to 60 times, with a 1 second network timeout.
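A minimal sketch of this kind of polling loop, in Python, for anyone following along. It is not the actual CI script from this thread. The function names `http_probe` and `wait_until_ready` are hypothetical, and the longer per-request timeout plus the pause between attempts are assumptions aimed at keeping the load on FusionAuth low; whether that avoids the timeouts described above is untested.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True only when the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts and 4xx/5xx responses all count
        # as "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 60,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, pausing `delay` seconds in between."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the server room between requests
    return False
```

It could be called as `wait_until_ready(lambda: http_probe("https://DOMAIN/api/application/UUID/oauth-configuration"))`, with DOMAIN and UUID filled in, and the CI step would fail when it returns `False`.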
Depending on how FusionAuth handles client-interrupted connections, it could be that we are flooding it that way?
Regarding the webhook or async notification on Kickstart completion: that would not be the easiest to use. We use Cypress, and it would be difficult to tie that together.
The best experience would be to have FusionAuth NOT accept incoming requests until the Kickstart sequence is done, and fail to boot entirely if the Kickstart failed.
I can imagine that this is probably asking a lot, so a compromise would be some lightweight endpoint where we can ask for the Kickstart status: some HTTP endpoint that replies with will_not_kickstart (e.g. when there is already a config), waiting_to_kickstart, kickstart_running, kickstart_success, or kickstart_failed.
Polling for a lighter weight object would be possible. I'm not the biggest fan of the lambda check as it requires the CI runner to know the API access token. Anything else you could suggest?
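To illustrate the status-endpoint compromise above, here is a rough Python sketch of how a CI script might act on the five proposed values. No such endpoint exists in FusionAuth as far as this thread establishes; the enum, the `next_action` helper and the mapping to "proceed", "wait" and "abort" are all hypothetical.

```python
from enum import Enum


class KickstartStatus(Enum):
    # The five values proposed in this post (hypothetical, not a real API).
    WILL_NOT_KICKSTART = "will_not_kickstart"
    WAITING_TO_KICKSTART = "waiting_to_kickstart"
    KICKSTART_RUNNING = "kickstart_running"
    KICKSTART_SUCCESS = "kickstart_success"
    KICKSTART_FAILED = "kickstart_failed"


def next_action(status: KickstartStatus) -> str:
    """Map a reported status to what the CI run should do next."""
    if status in (KickstartStatus.KICKSTART_SUCCESS,
                  KickstartStatus.WILL_NOT_KICKSTART):
        return "proceed"  # configuration is in its final state
    if status is KickstartStatus.KICKSTART_FAILED:
        return "abort"    # no point in running the tests
    return "wait"         # still waiting or running: poll again later
```

The point of the distinct values is that the CI run can stop immediately on a failure instead of waiting for a polling timeout.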
We are running this in GitHub CI, inside a docker-compose setup. Again, we don't have the logs yet. It will take us a week or so to get them.
I'll get back here as soon as I have more info.
-
@bert-goethals said in Kickstart in CI environment unpredictable:
The best experience would be to have FusionAuth NOT accept incoming requests until the Kickstart sequence is done, and fail to boot entirely if the Kickstart failed.
That seems like a reasonable feature request, as we really want to make Kickstart as CI/CD friendly as possible.
Can you please file a feature request here: https://github.com/fusionauth/fusionauth-issues/issues
Please reference this forum post and include your compromise about a lightweight endpoint.
Polling for a lighter weight object would be possible. I'm not the biggest fan of the lambda check as it requires the CI runner to know the API access token. Anything else you could suggest?
You could create an API key that is limited to only be able to read the lambda endpoint? Would that alleviate your security concerns?
There are other objects that are going to be lightweight and that you could, again, create a limited API key for (user actions, consents, form fields), but nothing springs to mind that doesn't require an API key. In fact, I'm kinda surprised that the oauth-configuration endpoint is available (I don't see that being an open endpoint here: https://fusionauth.io/docs/v1/tech/apis/applications/#retrieve-an-application ).
Will be very interested to see what your logs say around resource starvation. Please also include how much memory is available for the CI image and what your FusionAuth memory configuration is.
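As a rough illustration of the limited API key idea, the check from the CI side could look something like the Python sketch below. It assumes the lambda is retrieved with `GET /api/lambda/{lambdaId}` and that the API key is sent as-is in the `Authorization` header; please verify both against the API docs. The helper names and the idea of a fixed, known lambda ID set in the kickstart file are assumptions made for this example.

```python
import urllib.error
import urllib.request


def build_lambda_check(base_url: str, lambda_id: str, api_key: str) -> urllib.request.Request:
    """Build the request that looks up the marker lambda by its ID."""
    url = f"{base_url.rstrip('/')}/api/lambda/{lambda_id}"
    # The key should be one restricted to GET on the lambda endpoint only.
    return urllib.request.Request(url, headers={"Authorization": api_key}, method="GET")


def lambda_exists(request: urllib.request.Request, timeout: float = 5.0) -> bool:
    """Return True when the lookup answers with HTTP 200."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # 404 (lambda not created yet), 401, or server not reachable.
        return False
```

Because the key can only read lambdas, leaking it into the CI environment exposes much less than a full-access key would.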
-
There is now a webhook for kickstart completion available in 1.30.0
https://github.com/FusionAuth/fusionauth-issues/issues/1178 is the tracking issue.
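One possible way to consume this in CI is a tiny listener that blocks until the event arrives. The Python sketch below is only an illustration: it assumes the event arrives as a JSON POST whose body looks like `{"event": {"type": "kickstart.success", ...}}`, and the port, timeout and function names are made up for the example. Check the webhook documentation for the actual payload.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

kickstart_done = threading.Event()


def is_kickstart_success(body: bytes) -> bool:
    """Return True if the body is a kickstart.success event (assumed shape)."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    event = payload.get("event") if isinstance(payload, dict) else None
    return isinstance(event, dict) and event.get("type") == "kickstart.success"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        if is_kickstart_success(self.rfile.read(length)):
            kickstart_done.set()
        # Always answer 200 so the sender does not treat the call as failed.
        self.send_response(200)
        self.end_headers()


def wait_for_kickstart(port: int = 8000, timeout: float = 120.0) -> bool:
    """Listen for the webhook and return True once it has been received."""
    server = HTTPServer(("0.0.0.0", port), WebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return kickstart_done.wait(timeout)
    finally:
        server.shutdown()
        server.server_close()
```

For this to work, the webhook itself would presumably have to be defined in the kickstart file and point at a URL that the FusionAuth container can reach, for example another service in the same docker-compose network.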
-
@bert-goethals Can you please explain how you have used the webhook in your GitHub CI action?
I've got the same problem happening and I'm not even sure what to point the new webhook at.
I've posted this question; if you could please assist me in any way at all, it would be highly appreciated: https://fusionauth.io/community/forum/topic/1525/how-should-i-be-using-the-kickstart-success-webhook