hasura / graphql-engine
RFC: server integration test suite improvements #6951
Comments
Suggestion for a new problem + solution:
Thanks for putting this together, Abby!
We could in theory hack something together with Haskell code coverage: for each integration test class/module, store a code coverage report, then (waves hands) combine that with the local git diff...
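A rough sketch of what that coverage-guided selection might look like. Everything here is hypothetical: the `coverage_map` and file paths are invented, and a real version would parse HPC reports and the output of `git diff --name-only` instead of hard-coded data.

```python
# Hypothetical sketch: pick integration test modules to re-run based on
# per-module coverage data and the set of locally changed files.

def select_tests(coverage_map, changed_files):
    """Return test modules whose recorded coverage touches any changed file."""
    changed = set(changed_files)
    return sorted(
        module
        for module, covered in coverage_map.items()
        if changed & set(covered)
    )

# Example: per-test-class coverage gathered in a previous run (invented paths).
coverage_map = {
    "test_graphql_queries": ["src/Hasura/GraphQL/Execute.hs", "src/Hasura/SQL/Types.hs"],
    "test_events": ["src/Hasura/Eventing/EventTrigger.hs"],
}

print(select_tests(coverage_map, ["src/Hasura/SQL/Types.hs"]))
# ['test_graphql_queries'] — only the query tests touch the changed file.
```

The hand-waving is in building `coverage_map` reliably; the selection step itself is cheap.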
The process linked for adding a new backend doesn't sound too bad to me, though I probably have low expectations... I definitely agree with the goal of encouraging thorough testing against new backends.
One thing I've noticed is that some sets of tests within a class are stateful, so if an earlier test fails, the later ones will too. I think the appropriate fix is to just make them one big test, with comments marking the different phases.
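A minimal illustration of folding the stateful phases into one test. `FakeClient` is a stand-in for the real test harness, not Hasura's actual client; the point is only the shape of the test.

```python
# Sketch: one test covering a stateful sequence, with comments marking the
# phases that used to be separate (order-dependent) tests.

class FakeClient:
    """Stand-in for the real integration-test client (hypothetical)."""

    def __init__(self):
        self.tracked = set()

    def track_table(self, name):
        self.tracked.add(name)
        return {"ok": True}

    def query(self, table):
        if table not in self.tracked:
            return {"error": "table not tracked"}
        return {"data": []}


def test_track_and_query():
    client = FakeClient()
    # Phase 1: setup — track the table (previously its own test).
    assert client.track_table("authors") == {"ok": True}
    # Phase 2: exercise — the query now succeeds (previously a dependent test
    # that would fail confusingly whenever phase 1 failed).
    assert client.query("authors") == {"data": []}


test_track_and_query()
```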
Yeah, it's annoying to match setup/teardown specs to tests... I think an improvement would be for every test class to have its own set of setup/teardown files, following some obvious naming convention, even if that meant duplicating some of these files.
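One possible shape for that convention: derive the fixture paths mechanically from the class name, so the mapping never needs a lookup table. The directory layout below is invented for illustration.

```python
# Sketch of a naming convention mapping a test class to its own
# setup/teardown files. Directory layout is hypothetical.

def setup_files_for(cls_name):
    """Map e.g. 'TestGraphQLQueryBasic' to its conventional fixture paths."""
    stem = cls_name.removeprefix("Test").lower()  # Python 3.9+
    return (f"queries/{stem}/setup.yaml", f"queries/{stem}/teardown.yaml")


print(setup_files_for("TestGraphQLQueryBasic"))
# ('queries/graphqlquerybasic/setup.yaml', 'queries/graphqlquerybasic/teardown.yaml')
```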
Thanks so much for these suggestions @jberryman, I've included them in the description above.
I hope Evie doesn't mind me linking to her PR as an example to illustrate that, in practice, the process turns out to be much clunkier. Combined with unclear test logs, testing new backends in bulk can be difficult due to errors or test failures that might be caused by SQL syntax differences.
For "reasoning about test state and coverage":
How about unifying setup/teardown so that there is a single way to do it? The syntax might become slightly more complex, but the advantage is that it'll always be clear how to do it, and there's no more redundancy. Additionally, I think adding new tests should only require creating new files: it's really easy to forget to update the Python classes with the appropriate method, and PRs still get accepted that way.
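A sketch of what file-driven discovery could look like: cases are collected from files on disk, so adding a test means adding a file, with no Python class to edit. The demo uses a temporary directory as a stand-in for a real fixtures directory (e.g. something like `queries/basic/` — the path is invented here).

```python
# Sketch: discover test cases purely from files on disk.
import pathlib
import tempfile


def discover_cases(root):
    """Return test-case ids derived from YAML files under `root`."""
    return sorted(p.stem for p in pathlib.Path(root).glob("*.yaml"))


# Demo with a temp dir standing in for a real fixtures directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("select_authors.yaml", "insert_article.yaml"):
        (pathlib.Path(d) / name).write_text("# declarative test spec\n")
    print(discover_cases(d))
# ['insert_article', 'select_authors']
```

With pytest, the discovered ids would feed `@pytest.mark.parametrize`, so the only step to add a test is dropping a new file in the directory.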
I've created separate issues for each of the proposed goals listed above; we'll aim to prioritise these in the server team sprints over the next few weeks, but please consider these individual issues open to feedback, discussion and contribution. Otherwise, I'm closing this RFC with the plan to keep the individual issues updated.
Proposed problems and solutions
🙏🏽 But first, a grateful acknowledgement of past & ongoing contributions to an extensive integration test suite and significant CI build & test run time improvements.
It currently takes 15+ minutes on a beefy desktop to run the Postgres integration test suite, and we can expect significant slowdown as we add backends. Coupled with hour-long CI build times (at P95), this is prohibitive to testing and integrating changes quickly.
suggested goal:
reduce local integration test run time from ~15mins to [insert suggestions here. A quick Google indicates that a reasonable run time is 3-5 minutes].
suggested improvements
adding test support for a new backend (e.g. SQL Server, BigQuery) is currently a four- or five-step process involving a lot of code and YAML-file duplication, with many minor, manual, error-prone tweaks (see linked examples). This is likely to discourage extensive testing and community contribution.
suggested goal: eliminate/minimise duplication of code and test files for backend-agnostic feature tests and optimise workflow for backend-specific feature testing.
suggested improvements
our pytest output has a pretty low signal-to-noise ratio, and it's often difficult to understand the cause of a test failure, if there is one. Example: an 8,600-line log for what turned out to be a flaky test that passed on a second run.
suggested improvements
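One low-effort mitigation worth considering (a suggestion, not part of the proposals above): standard pytest options can shorten tracebacks, summarise non-passing outcomes, and stop early when failures cascade. The defaults below are illustrative, not a recommendation of specific values.

```ini
# pytest.ini — hypothetical defaults to cut log noise.
# --tb=short   : abbreviated tracebacks instead of full source context
# -ra          : end-of-run summary of all non-passing outcomes
# --maxfail=5  : stop after 5 failures rather than flooding the log
[pytest]
addopts = --tb=short -ra --maxfail=5
```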
identifying which tests are actually run, and which setup/teardown files support them, currently requires manually matching test files against string filepaths. Example. This encourages cargo-culting redundant files and makes refactoring and unused-file cleanup difficult.
goal: unsure, to be honest. Is it enough to find and remove dead code regularly, so that we have more confidence that the existing files are relevant? In which case...
suggested improvements
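A sketch of the kind of dead-file check this could enable, assuming tests reference their fixtures by string filepath as they do today. The file names and source snippets below are invented for illustration.

```python
# Sketch: flag setup/teardown YAML files that no test module mentions.

def unreferenced_files(yaml_names, test_sources):
    """Return fixture file names that appear in no test source text."""
    return sorted(
        name for name in yaml_names
        if not any(name in src for src in test_sources)
    )


# Hypothetical inventory: three fixtures, two referenced by a test module.
yaml_names = ["setup_basic.yaml", "teardown_basic.yaml", "setup_orphaned.yaml"]
test_sources = ['setup_files = ["setup_basic.yaml", "teardown_basic.yaml"]']

print(unreferenced_files(yaml_names, test_sources))
# ['setup_orphaned.yaml']
```

Run periodically (or in CI), a check like this would give more confidence that every fixture file on disk is actually exercised.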
Request for comments
What happens next
I'll leave this open for at least a few weeks to gather feedback and suggestions. However, the team's quite keen to address these kinds of ongoing improvements, so will likely prioritise some of them in our upcoming sprints and update this issue accordingly.