Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findById Optimization #35261

Merged
merged 25 commits into from
Jun 9, 2023
Merged

findById Optimization #35261

merged 25 commits into from
Jun 9, 2023

Conversation

TheovanKraay
Copy link
Member

@TheovanKraay TheovanKraay commented Jun 1, 2023

Description

The current default behaviour of findById(ID id) is that it will always execute a cross partition key query. This is not an optimal behaviour for containers with big datasets/large numbers of partitions. This change is an optimization to make findById(ID id) execute a point read when partitionKey is id, and log a warning when it is a field other than id. The new default behaviour is almost certain to be what user intends. Current behaviour can potentially cause significant unintended latency issues.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@saragluna
Copy link
Member

saragluna commented Jun 2, 2023

@TheovanKraay will this change be ported to the spring boot 3 branch?

@TheovanKraay
Copy link
Member Author

TheovanKraay commented Jun 2, 2023

@TheovanKraay will this change be ported to the spring boot 3 branch?

Oh I had not intended to as I expected this to be merged earlier as it’s only a small change. Can you pull from main when merged? The changes are very small. Or will Spring Boot 3 branch be merged very soon?

@azure-sdk
Copy link
Collaborator

API change check

API changes are not detected in this pull request.

@TheovanKraay
Copy link
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -351,6 +351,15 @@ public <T> T findById(Object id, Class<T> domainType, PartitionKey partitionKey)
public <T> T findById(String containerName, Object id, Class<T> domainType) {
Assert.hasText(containerName, "containerName should not be null, empty or only whitespaces");
Assert.notNull(domainType, "domainType should not be null");
CosmosContainerProperties containerProperties = getContainerProperties(containerName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this add another round trip to the server for every findById? Can we cache this somewhere?

Copy link
Member Author

@TheovanKraay TheovanKraay Jun 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh - yes, of course - fixed :-)

@TheovanKraay
Copy link
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -5,6 +5,7 @@
#### Features Added

#### Breaking Changes
* Optimization for `findById(ID id)` so that it will execute point reads where id is also the partition key, and log a warning where it is not. Technically a breaking change, but new behaviour is more optimal, especially for large containers with many partitions - see [PR 35261](https://github.com/Azure/azure-sdk-for-java/pull/35261).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this still technically a breaking change? I don't think it is anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it would be breaking if behaviour changes. I’m not sure on what we define as “breaking”. Happy to change this.

return findById(id, domainType, new PartitionKey(id));
}
if (!this.pointReadWarningLogged) {
LOGGER.warn("The partitionKey is not id!! Consider using findById(ID id, PartitionKey partitionKey) instead. See https://aka.ms/PointReadsInSpring for more info.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the error message here to inlcude at least 1 sentence for the motivation like.

The partitionKey is not id!! Consider using findById(ID id, PartitionKey partitionKey) instead to avoid the need for using a cross partition query which results in higher latency and cost than necessary. See https://aka.ms/PointReadsInSpring for more info.

Copy link
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for few minor comments about log message and changelog text

@TheovanKraay
Copy link
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@TheovanKraay
Copy link
Member Author

/check-enforcer override

@kushagraThapar
Copy link
Member

/azp run java - spring - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@TheovanKraay
Copy link
Member Author

/check-enforcer override

BASIC_ITEM.getId(), BasicItem.class);
assertEquals(result, BASIC_ITEM);
assertThat(responseDiagnosticsTestUtils.getCosmosDiagnostics()).isNotNull();
assertThat(responseDiagnosticsTestUtils.getCosmosResponseStatistics()).isNull();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also add this line to make sure it is a point read ->

assertThat(responseDiagnosticsTestUtils.getCosmosDiagnostics().toString().contains("\"requestOperationType\":\"Read\"")).isTrue();

.consumeNextWith(actual -> Assert.assertEquals(actual, BASIC_ITEM))
.verifyComplete();
assertThat(responseDiagnosticsTestUtils.getCosmosDiagnostics()).isNotNull();
assertThat(responseDiagnosticsTestUtils.getCosmosResponseStatistics()).isNull();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - assertThat(responseDiagnosticsTestUtils.getCosmosDiagnostics().toString().contains("\"requestOperationType\":\"Read\"")).isTrue();

public void testFindById() {
// test findById (ID id) cross partition
final Address result = repository.findById(TEST_ADDRESS1_PARTITION1.getPostalCode()).get();
assertThat(responseDiagnosticsTestUtils.getCosmosResponseStatistics()).isNotNull();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should check for the operation type is query. Similar to above.

@TheovanKraay
Copy link
Member Author

/azp run java - spring - tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@kushagraThapar kushagraThapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work @TheovanKraay , thank you!!

@TheovanKraay TheovanKraay merged commit 8236c59 into Azure:main Jun 9, 2023
trande4884 pushed a commit that referenced this pull request Jun 9, 2023
* findById Optimization

* reformat

* add hashcode override and licence to BasicItem

* improve point read test

* address linting errors

* fix linting errors

* fix linting

* some enhancements

* fix build errors

* fix testFindByIdNull integration test failure

* add return null for CosmosAccessException

* address Fabian review comment

* update change log

* address review comments

* fix build errors

* fix build errors

* fix build errors

* fix build errors

* add testFindByIdWithInvalidId

* add missing license header

* add repository api checks for point read/query

* change pointReadWarningLogged to primitive

* edit change log and warning text

* add extra checks per Kushagra's comments

* Update CHANGELOG.md

---------

Co-authored-by: Theo van Kraay <thvankra@microsoft.com>
Co-authored-by: Kushagra Thapar <kushuthapar@gmail.com>
trande4884 added a commit that referenced this pull request Jun 9, 2023
trande4884 pushed a commit that referenced this pull request Jun 9, 2023
* findById Optimization

* reformat

* add hashcode override and licence to BasicItem

* improve point read test

* address linting errors

* fix linting errors

* fix linting

* some enhancements

* fix build errors

* fix testFindByIdNull integration test failure

* add return null for CosmosAccessException

* address Fabian review comment

* update change log

* address review comments

* fix build errors

* fix build errors

* fix build errors

* fix build errors

* add testFindByIdWithInvalidId

* add missing license header

* add repository api checks for point read/query

* change pointReadWarningLogged to primitive

* edit change log and warning text

* add extra checks per Kushagra's comments

* Update CHANGELOG.md

---------

Co-authored-by: Theo van Kraay <thvankra@microsoft.com>
Co-authored-by: Kushagra Thapar <kushuthapar@gmail.com>
trande4884 added a commit that referenced this pull request Jun 13, 2023
* findById Optimization (#35261)

* findById Optimization

* reformat

* add hashcode override and licence to BasicItem

* improve point read test

* address linting errors

* fix linting errors

* fix linting

* some enhancements

* fix build errors

* fix testFindByIdNull integration test failure

* add return null for CosmosAccessException

* address Fabian review comment

* update change log

* address review comments

* fix build errors

* fix build errors

* fix build errors

* fix build errors

* add testFindByIdWithInvalidId

* add missing license header

* add repository api checks for point read/query

* change pointReadWarningLogged to primitive

* edit change log and warning text

* add extra checks per Kushagra's comments

* Update CHANGELOG.md

---------

Co-authored-by: Theo van Kraay <thvankra@microsoft.com>
Co-authored-by: Kushagra Thapar <kushuthapar@gmail.com>

* Update sdk/spring/azure-spring-data-cosmos/CHANGELOG.md

Co-authored-by: Kushagra Thapar <kuthapar@microsoft.com>

---------

Co-authored-by: Theo van Kraay <theo.van@microsoft.com>
Co-authored-by: Theo van Kraay <thvankra@microsoft.com>
Co-authored-by: Kushagra Thapar <kushuthapar@gmail.com>
Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com>
Co-authored-by: Kushagra Thapar <kuthapar@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
azure-spring All azure-spring related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants