Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hole punching support on FreeBSD version 1400032 #12458

Closed
wants to merge 2 commits into from

Conversation

khng300
Copy link
Contributor

@khng300 khng300 commented Aug 5, 2021

Motivation and Context

This change introduces support for hole-punching facilities starting from __FreeBSD_version 1400032.

FreeBSD version 1400029 introduces hole-punching support. This comprises of a userland fspacectl(2) system call and the vn_deallocate(9) public KPI. The support was introduced at https://cgit.freebsd.org/src/diff/share/man/man9/VOP_DEALLOCATE.9?id=0dc332bff200c940edc36c4715b629a2e1e9f9ae.
FreeBSD version 1400030 introduces O_*SYNC support. In case ap->a_ioflag includes IO_SYNC we do a zil_commit with respect to the given znode.
The new semantic in 1400032 updates rmsr.r_offset to be rqsr.r_offset + number of bytes zeroed before EOF.

Description

This revision comprises of a FreeBSD-specific implementation for vop_deallocate.

How Has This Been Tested?

Run the setup on FreeBSD-current, and run all the test cases listed in /usr/tests/sys/file/fspacectl_test -l after changing working directory to a zfs mount. zts's fallocate-punch_hole has also been updated to include support for truncate -d on FreeBSD.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@khng300 khng300 changed the title Add hole punching support on FreeBSD after 1400029 Add hole punching support on FreeBSD version 1400029 Aug 5, 2021
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it generally looks good, thanks for working on this, have you intentionally not used zfs_space()/zfs_freesp() here? They are actually used to replay log record your code creates with zfs_log_truncate(), and as I see on a quick look they are used by Linux code. Was it to update the remaining offset/length? I actually worry if completion is partial somehow, how will it get logged then.

@khng300
Copy link
Contributor Author

khng300 commented Aug 5, 2021

Was it to update the remaining offset/length?

Yes.

I actually worry if completion is partial somehow, how will it get logged then.

There is currently no mechanism to log a zfs_space() that happens to make progress within the affected range but fails later, because TX_TRUNCATE logging happens only when dmu_free_long_range succeeds. I saw dmu_free_long_range could fail either due to unmounting (EINTR) or IO error in dbuf_hold_impl when walking the indirection tree. For the Linux's fallocate() case that means a fallocate() that fails in the middle of dmu_free_long_range() is not going to be logged.

For the log replaying part, zfs_space() is still called to replay the log record as there were no possible concurrent access during log replay. As long as we put the TX_TRUNCATE record with offset + length <= file size the file won't be accidentally enlarged when zfs_space() is called during replay (zfs_space() always enlarge a file when offset + length is beyond file size).

@amotin
Copy link
Member

amotin commented Aug 5, 2021

Was it to update the remaining offset/length?

Yes.

I wonder what is real-life scenario for returning earlier with length not zero, that would not be an error? Some signal? I wonder whether dmu_free_long_range() is even looking for signals. I've briefly tried to look for it and haven't found. In case of mentioned unmounting and I/O error I am not sure there is a big value to report the progress. If I/O is interrupted by signal, then it would make sense to log the operation as partial.

@khng300
Copy link
Contributor Author

khng300 commented Aug 5, 2021

Was it to update the remaining offset/length?

Yes.

I wonder what is real-life scenario for returning earlier with length not zero, that would not be an error? Some signal? I wonder whether dmu_free_long_range() is even looking for signals. I've briefly tried to look for it and haven't found. In case of mentioned unmounting and I/O error I am not sure there is a big value to report the progress. If I/O is interrupted by signal, then it would make sense to log the operation as partial.

Currently pending signal checking does not happen inside zfs_space(), so yes for zfs's case it would finish as a single iteration within vn_deallocate_impl(). I do agree it is actually tolerable that callers do not need to know the internal remaining length exactly before the freeing fails if it was due to I/O error. In such scenario it is impossible to guarantee anything useful either. For the unmounting case, that remaining length in case any works have been done, can still be a hint to the runtime when the same dataset is mounted again. If the remaining length returned by zfs shows progress, when unmounting aborts dmu_free_long_range(), fspacectl(2) returns 0 and in userspace the callers will see the operation succeeds with return value equals 0, and with a non-zero rmsr.r_len.

However I am actually okay to change them if simply reusing zfs_space() is much favorable than considering the two niche cases.

@amotin
Copy link
Member

amotin commented Aug 5, 2021

However I am actually okay to change them if simply reusing zfs_space() is much favorable than considering the two niche cases.

I am all for the code reuse and minimal divergence between platforms. It simplifies maintenance a lot.

@khng300 khng300 marked this pull request as ready for review August 7, 2021 06:40
@khng300 khng300 requested a review from amotin August 7, 2021 07:29
@khng300
Copy link
Contributor Author

khng300 commented Aug 9, 2021

@amotin The PR has been converted to use VOP_GETATTR and zfs_space instead.

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is much smaller, I like it. I am just thinking whether VOP_GETATTR() to get only a file size is an overkill. Wouldn't zp->z_size (may be with ZFS_ENTER()+ZFS_VERIFY_ZP() as required by comment) give us what we need? I don't see zfs_read()/zfs_write() do anything more for it.

@khng300
Copy link
Contributor Author

khng300 commented Aug 10, 2021

It is much smaller, I like it. I am just thinking whether VOP_GETATTR() to get only a file size is an overkill. Wouldn't zp->z_size (may be with ZFS_ENTER()+ZFS_VERIFY_ZP() as required by comment) give us what we need? I don't see zfs_read()/zfs_write() do anything more for it.

Wouldn't zp->z_size (may be with ZFS_ENTER()+ZFS_VERIFY_ZP() as required by comment) give us what we need? I don't see zfs_read()/zfs_write() do anything more for it.

Yes, so the new version does regular ZFS_ENTER()+ZFS_VERIFY_ZP(), checks read-only and calls zfs_freesp() directly.

EDIT: Note that, the zfs_access check is not there in zfs_deallocate compared to zfs_space(). Neither Solaris nor FreeBSD supports the MADV_REMOVE semantic in madvise(2). MADV_REMOVE is Linux-only.

@khng300 khng300 requested a review from amotin August 10, 2021 07:34
@ghost ghost added the Status: Code Review Needed Ready for review and testing label Aug 10, 2021
@khng300 khng300 requested a review from a user August 10, 2021 11:56
@ghost
Copy link

ghost commented Aug 10, 2021

Let's wait for a new FreeBSD main snapshot build (Thursday) so we can properly test this. The most recent snapshot can't build anything because of locale breakage.

@khng300
Copy link
Contributor Author

khng300 commented Aug 12, 2021

I am going to rebase it onto the upcoming 1400030. This adds support for IO_SYNC cases.

EDIT: Done.

@khng300 khng300 changed the title Add hole punching support on FreeBSD version 1400029 Add hole punching support on FreeBSD version 1400030 Aug 12, 2021
@khng300 khng300 requested a review from amotin August 12, 2021 15:11
@khng300 khng300 changed the title Add hole punching support on FreeBSD version 1400030 Add hole punching support on FreeBSD version 1400029 and 1400030 Aug 12, 2021
@amotin
Copy link
Member

amotin commented Aug 12, 2021

@khng300 , not directly related to this, so can be separate PR, but it would be good to also implement ZIO_TYPE_TRIM in vdev_file.c using these new operations. IIRC we have number of ZFS tests blocked because of this missing piece. @freqlabs should know better which.

@khng300
Copy link
Contributor Author

khng300 commented Aug 25, 2021

Under the new semantic in 1400032 rmsr.r_offset equals rqsr.r_offset + number of bytes zeroed before EOF. So this does not require returning file size at all. (The original intent of making such change is to allow callers simply calculate the number of bytes zeroed by a subtraction, and to also make nfs's deallocate implementation easier.)

This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
@khng300 khng300 changed the title Add hole punching support on FreeBSD version 1400030 Add hole punching support on FreeBSD version 1400032 Aug 25, 2021
@khng300
Copy link
Contributor Author

khng300 commented Aug 26, 2021

@freqlabs Just in case, zts's fallocate-punch_hole could be tested on -CURRENT 20210826.

@khng300
Copy link
Contributor Author

khng300 commented Aug 27, 2021

@behlendorf Is it possible to retrigger the CI as well? I checked that the -CURRENT buildbot should be on snapshot 20210826 since yesterday.

@behlendorf
Copy link
Contributor

@khng300 in fact it is. I went ahead and resubmitted the builds for all the FreeBSD builders.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and all of the FreeBSD bots are passing.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Aug 30, 2021
behlendorf pushed a commit that referenced this pull request Aug 30, 2021
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12458
rincebrain pushed a commit to rincebrain/zfs that referenced this pull request Sep 22, 2021
This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
rincebrain pushed a commit to rincebrain/zfs that referenced this pull request Sep 22, 2021
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Feb 10, 2022
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Feb 14, 2022
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Feb 16, 2022
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Feb 17, 2022
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request May 16, 2022
This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes openzfs#12458
behlendorf pushed a commit that referenced this pull request May 17, 2022
This adds supports for hole-punching facilities in the FreeBSD kernel
starting from __FreeBSD_version 1400032.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12458
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants