git.proxmox.com Git - pve-qemu.git/log

Thomas Lamprecht [Fri, 6 Sep 2024 14:22:13 +0000 (16:22 +0200)]

bump version to 9.0.2-3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Thu, 5 Sep 2024 09:49:53 +0000 (11:49 +0200)]

pick up stable fixes for 9.0

Includes fixes for VirtIO-net, ARM and x86(_64) emulation, CVE fixes
to harden the NBD server against malicious clients, as well as a few
others (VNC, physmem, Intel IOMMU, ...).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Thu, 5 Sep 2024 09:49:52 +0000 (11:49 +0200)]

pick up fix for VirtIO PCI regressions

Commit f06b222 ("fixes for QEMU 9.0") included a revert for the QEMU
commit 2ce6cff94d ("virtio-pci: fix use of a released vector"). That
commit caused some regressions which sounded just as bad as the issue
it fixed. Those regressions have now been addressed upstream, so pick
up the fix and drop the revert. Dropping the revert fixes the original
issue that commit 2ce6cff94d ("virtio-pci: fix use of a released
vector") addressed.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Wed, 7 Aug 2024 08:17:15 +0000 (10:17 +0200)]

bump version to 9.0.2-2

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Wed, 7 Aug 2024 07:42:18 +0000 (09:42 +0200)]

actually bump submodule to v9.0.2

Fixes: cf40e92 ("update submodule and patches to QEMU 9.0.2")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Mon, 29 Jul 2024 16:59:45 +0000 (18:59 +0200)]

bump version to 9.0.2-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Thu, 25 Jul 2024 09:45:54 +0000 (11:45 +0200)]

some more stable fixes for QEMU 9.0.2

Fix the two issues reported in the community forum [0][1], i.e. a
regression in the LSI-53c895a controller and ignored boot order for
USB storage (only possible via custom arguments in Proxmox VE), both
causing boot failures. Also pick up fixes for VirtIO, ARM emulation,
char IO device and a graph lock fix for the block layer.

The block-copy patches that serve as a preparation for fleecing are
moved to the extra folder, because the graph lock fix requires them
to be present first. They have been applied upstream in the meantime
and should drop out with the rebase on 9.1.

[0]: https://forum.proxmox.com/threads/149772/post-679433
[1]: https://forum.proxmox.com/threads/149772/post-683459

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Thu, 25 Jul 2024 09:45:53 +0000 (11:45 +0200)]

update submodule and patches to QEMU 9.0.2

Most relevant are some fixes for VirtIO and for ARM and i386
emulation. There is also a fix for screen blanking with VGA display,
which fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=4786

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 8 Jul 2024 14:13:44 +0000 (16:13 +0200)]

bump version to 9.0.0-6

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 8 Jul 2024 10:09:20 +0000 (12:09 +0200)]

zeroinit: fix regression with filename parsing

As reported in the community forum [0], cloning or importing images
to RBD storages (without the krbd setting) was broken. This is a
result of no filename parsing happening anymore in bdrv_open_child()
after commit b242e7f ("backport fix for CVE-2024-4467"), which the
zeroinit driver relied on for passing along the RBD filename+key-value
pairs.

There is a dedicated function for opening the file child which still
does filename parsing. Use that for opening the file child. Role and
flags should still be the same as with the manual bdrv_open_child(),
because the zeroinit driver is a filter, and the assignment bs->file
is also done by bdrv_open_file_child().

Fixes: b242e7f ("backport fix for CVE-2024-4467")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[0]: https://forum.proxmox.com/threads/qemu-9-0-available-on-pve-no-subscription-as-of-now.149772/post-681620
FG: added missing link

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

Fiona Ebner [Wed, 3 Jul 2024 11:20:03 +0000 (13:20 +0200)]

bump version to 9.0.0-5

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Wed, 3 Jul 2024 11:03:49 +0000 (13:03 +0200)]

backport fix for CVE-2024-4467

This prevents malicious qcow2 images from causing harm merely by being
queried via 'qemu-img info'.

For Proxmox VE, this is an additional safeguard, as it currently
creates and manages the qcow2 images used by VMs itself and does not
allow unprivileged users to import them.

Reference: https://access.redhat.com/security/cve/cve-2024-4467

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 14 Jun 2024 11:00:42 +0000 (13:00 +0200)]

fix #4726: avoid superfluous check in vma code

The 'status' pointer is dereferenced regardless of the NULL check,
i.e. 'status->closed' is accessed after the branch with the check.
Since all callers pass in the address of a struct on the stack, the
pointer can never be NULL. Remove the superfluous check and add an
assert instead.
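
A minimal sketch of the pattern, not the actual vma code: `WriterStatus`
and `fill_status()` are hypothetical names chosen for illustration. It
shows the contract being stated with an assert instead of a NULL check
that could never have protected the later dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the status struct used by the vma code. */
typedef struct {
    bool closed;
    char errmsg[64];
} WriterStatus;

/*
 * Before: an `if (status) { ... }` guard followed by an unconditional
 * `status->closed` access, so the guard protected nothing.
 * After: state the contract with an assert and drop the guard.
 */
static int fill_status(WriterStatus *status, bool closed)
{
    assert(status != NULL); /* callers always pass the address of a local */

    memset(status, 0, sizeof(*status));
    status->closed = closed;
    return status->closed ? 1 : 0;
}
```

A caller would declare `WriterStatus st;` on the stack and pass `&st`,
which is why the pointer cannot be NULL in practice.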

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

Fiona Ebner [Mon, 1 Jul 2024 09:32:49 +0000 (11:32 +0200)]

bump version to 9.0.0-4

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 28 Jun 2024 08:46:56 +0000 (10:46 +0200)]

async snapshot: fix crash with VirtIO block with iothread when not saving VM state

As reported in the community forum [0], doing a snapshot without
saving the VM state for a VM with a VirtIO block device with iothread
would lead to an assertion failure [1] and thus crash.

The issue is that vm_start() is called from the coroutine
qmp_savevm_end() which violates assumptions about graph locking down
the line. Factor out the part of qmp_savevm_end() that actually needs
to be a coroutine into a separate helper and turn qmp_savevm_end()
into a non-coroutine, so that it can call vm_start() safely.

The issue is likely not new, but was exposed by the recent graph
locking rework introducing stricter checks.

The issue does not occur when saving the VM state, because then the
non-coroutine process_savevm_finalize() will already call vm_start()
before qmp_savevm_end().

[0]: https://forum.proxmox.com/threads/149883/

[1]:

> #0  0x00007353e6096e2c __pthread_kill_implementation (libc.so.6 + 0x8ae2c)
> #1  0x00007353e6047fb2 __GI_raise (libc.so.6 + 0x3bfb2)
> #2  0x00007353e6032472 __GI_abort (libc.so.6 + 0x26472)
> #3  0x00007353e6032395 __assert_fail_base (libc.so.6 + 0x26395)
> #4  0x00007353e6040eb2 __GI___assert_fail (libc.so.6 + 0x34eb2)
> #5  0x0000592002307bb3 bdrv_graph_rdlock_main_loop (qemu-system-x86_64 + 0x83abb3)
> #6  0x00005920022da455 bdrv_change_aio_context (qemu-system-x86_64 + 0x80d455)
> #7  0x00005920022da6cb bdrv_try_change_aio_context (qemu-system-x86_64 + 0x80d6cb)
> #8  0x00005920022fe122 blk_set_aio_context (qemu-system-x86_64 + 0x831122)
> #9  0x00005920021b7b90 virtio_blk_start_ioeventfd (qemu-system-x86_64 + 0x6eab90)
> #10 0x0000592002022927 virtio_bus_start_ioeventfd (qemu-system-x86_64 + 0x555927)
> #11 0x0000592002066cc4 vm_state_notify (qemu-system-x86_64 + 0x599cc4)
> #12 0x000059200205d517 vm_prepare_start (qemu-system-x86_64 + 0x590517)
> #13 0x000059200205d56b vm_start (qemu-system-x86_64 + 0x59056b)
> #14 0x00005920020a43fd qmp_savevm_end (qemu-system-x86_64 + 0x5d73fd)
> #15 0x00005920023f3749 qmp_marshal_savevm_end (qemu-system-x86_64 + 0x926749)
> #16 0x000059200242f1d8 qmp_dispatch (qemu-system-x86_64 + 0x9621d8)
> #17 0x000059200238fa98 monitor_qmp_dispatch (qemu-system-x86_64 + 0x8c2a98)
> #18 0x000059200239044e monitor_qmp_dispatcher_co (qemu-system-x86_64 + 0x8c344e)
> #19 0x000059200245359b coroutine_trampoline (qemu-system-x86_64 + 0x98659b)
> #20 0x00007353e605d9c0 n/a (libc.so.6 + 0x519c0)

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 14 Jun 2024 13:15:14 +0000 (15:15 +0200)]

PVE backup: remove unused targetfile member from device info

This became unused after 9e0186f ("backup: drop broken
BACKUP_FORMAT_DIR").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 14 Jun 2024 11:45:33 +0000 (13:45 +0200)]

remove outdated comments about AioContext locking

AioContext locking got removed in QEMU 9.0.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 10 Jun 2024 13:38:51 +0000 (15:38 +0200)]

pbs block driver: use custom error message when returned aid is too large

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Jing Luo [Mon, 10 Jun 2024 12:05:26 +0000 (21:05 +0900)]

pbs block driver: improve data type for aid member

On ARM, GCC warns (-Werror=type-limits) that the condition of the if
statement is always false. This is because s->aid is defined as char,
while proxmox_restore_open_image() returns an int.

This is probably because char is treated as unsigned on ARM but as
signed on x86:

https://developer.arm.com/documentation/den0013/d/Porting/Miscellaneous-C-porting-issues/unsigned-char-and-signed-char

Make aid an explicit uint8_t, because that is the type for functions
taking the aid as a parameter, e.g. proxmox_restore_get_image_length().
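
The effect can be illustrated with a small sketch. The helper names
below are hypothetical and not part of the PBS block driver; explicit
`signed char` and `unsigned char` stand in for what plain `char` means
on x86 and ARM respectively, and the range check in `store_aid()` is an
assumption about how an int result could be validated before it is
stored in a uint8_t.

```c
#include <stdint.h>

/* Narrowing an int result into an unsigned 8-bit value (plain char on ARM). */
static int narrow_to_unsigned_char(int ret)
{
    unsigned char aid = (unsigned char)ret;
    return aid; /* -1 becomes 255, so a "< 0" check can never trigger */
}

/* Narrowing into a signed 8-bit value (plain char on x86). */
static int narrow_to_signed_char(int ret)
{
    signed char aid = (signed char)ret;
    return aid; /* -1 stays -1 */
}

/*
 * Hypothetical helper: validate the int result first, then store it in an
 * explicit uint8_t, independent of the platform's char signedness.
 */
static int store_aid(int ret, uint8_t *aid)
{
    if (ret < 0 || ret > UINT8_MAX) {
        return -1;
    }
    *aid = (uint8_t)ret;
    return 0;
}
```

Checking the int before narrowing keeps the error path identical on both
architectures.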

Signed-off-by: Jing Luo <jing@jing.rocks>
[FE: slightly improve commit message]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Wed, 29 May 2024 14:02:36 +0000 (16:02 +0200)]

bump version to 9.0.0-3

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Wed, 29 May 2024 10:53:17 +0000 (12:53 +0200)]

more stable fixes for QEMU 9.0

Most importantly the first one "Revert "monitor: use
aio_co_reschedule_self()"", fixing a crash when doing hotplug+resize
with a disk using io_uring.

Other fixes (likely not too important) for TCG emulation of x86(_64)
and ARM.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Fri, 17 May 2024 15:05:10 +0000 (17:05 +0200)]

bump version to 9.0.0-2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Fri, 17 May 2024 08:44:57 +0000 (10:44 +0200)]

fixes for QEMU 9.0

Most importantly, fix forwards and backwards migration with VirtIO-GPU
display.

Other fixes are for a regression in pflash device (introduced in 8.2)
and some fixes for x86(_64) TCG emulation. One of the patches needed
to be adapted, because it removed a helper that is still in use in
9.0.0.

There also is a revert for a fix in VirtIO PCI devices that turned out
to cause some issues, see the revert itself for more details.

Lastly, there is a change to move compatibility flags for a new
VirtIO-net feature to the correct machine type. The feature was
introduced in QEMU 8.2, but the compatibility flags got added to
machine version 8.0 instead of 8.1. This breaks backwards migration
with machine version 8.1 from an 8.2/9.0 binary to an 8.1 binary, in
cases where the guest kernel enables the feature (e.g. Ubuntu 23.10).
While that breaks migration with machine version 8.1 from an unpatched
to a patched binary, Proxmox VE only ever had 8.2 on the test
repository and 9.0 not yet in any public repository. An upstream
developer suggested it is the proper fix [0]. Upstream submission [1].

[0]: https://lore.kernel.org/qemu-devel/CACGkMEtZrJuhof+hUGVRvLLQE+8nQE5XmSHpT0NAQ1EpnqfmsA@mail.gmail.com/T/#u
[1]: https://lore.kernel.org/qemu-devel/20240517075336.104091-1-f.ebner@proxmox.com/T/#u

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 29 Apr 2024 15:20:22 +0000 (17:20 +0200)]

backup: improve error when copy-before-write fails for fleecing

With fleecing, failure for copy-before-write does not fail the guest
write, but only sets the snapshot error that is associated with the
copy-before-write filter, making further requests to the snapshot
access fail with EACCES, which then also fails the job. But that error
code is not the root cause of why the backup failed, so bubble up the
original snapshot error instead.

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 29 Apr 2024 15:20:21 +0000 (17:20 +0200)]

fix #5409: backup: fix copy-before-write timeout

The type for the copy-before-write timeout in nanoseconds was wrong.
By being just uint32_t, a maximum of slightly over 4 seconds was
possible. Larger values would overflow, and thus the 45 seconds set by
Proxmox's backup with fleecing resulted in an effective timeout of
about 2 seconds for copy-before-write operations.
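
The arithmetic can be checked with a short sketch. The function names
are hypothetical, and the 64-bit type in the second function is an
assumption about what a sufficiently wide type looks like, not a
statement about the exact type chosen in the patch.

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000ULL

/* Buggy variant: the nanosecond value is squeezed into 32 bits. */
static uint32_t timeout_ns_u32(uint64_t seconds)
{
    /* The conversion wraps modulo 2^32 (about 4.29 seconds in ns). */
    return (uint32_t)(seconds * NANOSECONDS_PER_SECOND);
}

/* Wide variant: a 64-bit type holds the full nanosecond value. */
static uint64_t timeout_ns_u64(uint64_t seconds)
{
    return seconds * NANOSECONDS_PER_SECOND;
}
```

For 45 seconds, 45,000,000,000 ns modulo 2^32 (4,294,967,296) leaves
2,050,327,040 ns, i.e. roughly 2.05 seconds, while a value of 4 seconds
still fits into 32 bits unchanged.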

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Mon, 29 Apr 2024 08:51:43 +0000 (10:51 +0200)]

bump version to 9.0.0-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Thu, 25 Apr 2024 15:21:29 +0000 (17:21 +0200)]

update submodule and patches to QEMU 9.0.0

The biggest change is that AioContext locking got removed, but no
changes were required other than dropping the calls to acquire and
release it. As a consequence, the single parameter for the
bdrv_graph_wrlock() call got removed, which also required adaptation.

QAPI docs became stricter, requiring all members to be documented.

Other minor changes:

- Single parameter from migration_is_running() was dropped.
- qemu_mutex_(un)lock_iothread() got renamed to bql_(un)lock().

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Mon, 29 Apr 2024 08:53:43 +0000 (10:53 +0200)]

d/lintian: ignore missing source warning for linux-user vdso objects

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Sat, 27 Apr 2024 10:44:32 +0000 (12:44 +0200)]

bump version to 8.2.2-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Thu, 25 Apr 2024 15:21:28 +0000 (17:21 +0200)]

update submodule and patches to QEMU 8.2.2

This version includes both the AioContext lock and the block graph
lock, so there might be some deadlocks lurking. It's not possible to
disable the block graph lock as was done in QEMU 8.1, because there
are now changes like the function bdrv_schedule_unref() that require
it. QEMU 9.0 will finally get rid of the AioContext locking.

During live-restore with a VirtIO SCSI drive with iothread there is a
known racy deadlock related to the AioContext lock. Not new [1], but
not sure if more likely now. Should be fixed in QEMU 9.0.

The block graph lock comes with annotations that can be checked by
clang's TSA. This required changes to the block drivers, i.e.
alloc-track, pbs, zeroinit as well as taking the appropriate locks
in pve-backup, savevm-async, vma-reader.

Local variable shadowing is now prohibited via a compiler flag, which
required a slight adaptation in vma.c.

Major changes only affect alloc-track:

* It is not possible to call a generated co-wrapper like
  bdrv_get_info() while holding the block graph lock exclusively [0],
  which does happen during initialization of alloc-track when the
  backing hd is set and the refresh_limits driver callback is invoked.

  The bdrv_get_info() call to get the cluster size is moved to
  directly after opening the file child in track_open().

  The important thing is that at least the request alignment for the
  write target is used, because then the RMW cycle in bdrv_pwritev
  will gather enough data from the backing file. Partial cluster
  allocations in the target are not a fundamental issue, because the
  driver returns its allocation status based on the bitmap, so any
  other data that maps to the same cluster will still be copied later
  by a stream job (or during writes to that cluster).

* Replacing the node cannot be done in the
  track_co_change_backing_file() callback, because it is a coroutine
  and cannot hold the block graph lock exclusively. So it is moved to
  the stream job itself with the auto-remove option not having an
  effect anymore (qemu-server would always set it anyways).

  In the future, there could either be a special option for the stream
  job, or maybe the upcoming blockdev-replace QMP command can be used.

  Replacing the backing child is actually already done in the stream
  job, so no need to do it in the track_co_change_backing_file()
  callback. It also cannot be called from a coroutine. Looking at the
  implementation in the qcow2 driver, it doesn't seem to be intended
  to change the backing child itself, just update driver-internal
  state.

Other changes:

* alloc-track: Error out early when used without auto-remove. Since
  replacing the node now happens in the stream job, where the option
  cannot be read from (it's internal to the driver), it will always be
  treated as 'on'. This makes sure that users besides qemu-server notice
  the change (should they even exist). The option can be fully dropped
  in the future while adding a version guard in qemu-server.

* alloc-track: Avoid seemingly superfluous child permission update.
  Doesn't seem necessary nowadays (maybe after commit "alloc-track:
  fix deadlock during drop" where the dropping is not rescheduled and
  delayed anymore or some upstream change). Replacing the block node
  will already update the permissions of the new node (which was the
  file child before). Should there really be some issue, instead of
  having a drop state, this could also be just based off the fact
  whether there is still a backing child.

  Dumping the cumulative (shared) permissions for the BDS with a debug
  print yields the same values after this patch and with QEMU 8.1,
  namely 3 and 5.

* PBS block driver: compile unconditionally. Proxmox VE always needs
  it and something in the build process changed to make it not enabled
  by default. Probably would need to move the build option to meson
  otherwise.

* backup: job unreferencing during cleanup needs to happen outside of
  coroutine, so it was moved to before invoking the clean

* mirror: Cherry-pick stable fix to avoid potential deadlock.

* savevm-async: migrate_init now can fail, so propagate potential
  error.

* savevm-async: compression counters are not accessible outside
  migration/ram-compress now, so drop code that prophylactically set
  them to zero.

[0]: https://lore.kernel.org/qemu-devel/220be383-3b0d-4938-b584-69ad214e5d5d@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/e13b488e-bf13-44f2-acca-e724d14f43fd@proxmox.com/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Thu, 25 Apr 2024 15:21:27 +0000 (17:21 +0200)]

makefile: also filter 64-bit hppa ROM for QEMU 8.2

Same rationale as 6facdf3 ("also exclude hppa-firmware.img ROM from
build"), not used by Proxmox VE and would cause a failure during
build.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Thu, 25 Apr 2024 15:21:26 +0000 (17:21 +0200)]

makefile: adapt firmware blob removal to changes for QEMU 8.2

Namely, it's also necessary to remove .dts source files from the
meson.build file, because the .dtb file names are not directly listed
anymore since commit 6e0dc9d2a8 ("meson: compile bundled device
trees").

The same commit also introduced a "'.dtb'" in a line not just listing
a file name and removing that line would break the script. Be more
precise and require an alphanumeric character before the suffix.
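
The stricter match can be sketched as follows. This is only an
illustration in C using POSIX regular expressions; the pattern and the
function name are assumptions and not the expression actually used in
the Makefile, and the sample lines in the note below are made up.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the line names a .dtb file, i.e. if ".dtb" is
 * directly preceded by an alphanumeric character. A bare "'.dtb'"
 * suffix string does not match, because the character before the
 * dot is a quote.
 */
static bool names_dtb_file(const char *line)
{
    regex_t re;
    bool match;

    if (regcomp(&re, "[[:alnum:]]\\.dtb", REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }
    match = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return match;
}
```

With this pattern, a line such as `'bamboo.dtb',` is matched, while a
line that merely contains `'.dtb'` is left alone.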

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 12 Apr 2024 12:26:40 +0000 (14:26 +0200)]

Makefile: drop -j option from dpkg-buildpackage

From man dpkg-buildpackage:

> -j, --jobs[=jobs|auto]
> Specifies the number of jobs allowed to be run simultaneously (since
> dpkg 1.14.7, long option since dpkg 1.18.8). The number of jobs
> matching the number of online processors if auto is specified (since
> dpkg 1.17.10), or unlimited number if jobs is not specified. The
> default behavior is auto (since dpkg 1.18.11) in non-forced mode
> (since dpkg 1.21.10), and as such it is always safer to use with any
> package including those that are not parallel-build safe.

The option was added in the Makefile by commit 4ba321f ("build qemu
multithreaded") which states:

> same as in pve-kernel where we have --jobs=auto

But according to the man page, -j without an argument is not the same
and means unlimited. Using the number of online cores seems more
sensible and was the original intention. Again, according to the man
page, the default is auto since dpkg 1.18.11 (or Debian Stretch), so
just drop the option.

The motivation to look into this was that after the recent upstream
commit d1ce2cc95b ("Makefile: preserve --jobserver-auth argument when
calling ninja") having -j as the make flag would be broken as it was
mistakenly passed to ninja (for which the argument for -j is not
optional). Should get fixed soon [0].

[0]: https://lore.kernel.org/qemu-devel/20240412100401.20047-2-pbonzini@redhat.com/T/#u

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Thu, 11 Apr 2024 15:46:52 +0000 (17:46 +0200)]

bump version to 8.1.5-5

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Thu, 11 Apr 2024 15:38:26 +0000 (17:38 +0200)]

implement support for backup fleecing

Excerpt from Fiona's v3 cover-letter [0]:

When a backup for a VM is started, QEMU will install a
"copy-before-write" filter in its block layer. This filter ensures
that upon new guest writes, old data still needed for the backup is
sent to the backup target first. The guest write blocks until this
operation is finished so guest IO to not-yet-backed-up sectors will be
limited by the speed of the backup target.

With backup fleecing, such old data is cached in a fleecing image
rather than sent directly to the backup target. This can help guest IO
performance and even prevent hangs in certain scenarios, at the cost
of requiring more storage space.
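
As a rough illustration of the idea, here is a toy in-memory model. It
is not QEMU's block layer: `ToyBackup`, `guest_write()` and
`backup_block()` are hypothetical names, and real fleecing works on
block nodes and images rather than arrays.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_BLOCKS 4
#define BLOCK_SIZE 8

typedef struct {
    char disk[NUM_BLOCKS][BLOCK_SIZE];     /* live guest disk */
    char fleecing[NUM_BLOCKS][BLOCK_SIZE]; /* cache for old data */
    bool cached[NUM_BLOCKS];               /* old data saved in the cache */
    bool backed_up[NUM_BLOCKS];            /* block already in the backup */
} ToyBackup;

/* Guest write: save the old data to the fleecing cache first if needed. */
static void guest_write(ToyBackup *b, int block, const char *data)
{
    if (!b->backed_up[block] && !b->cached[block]) {
        memcpy(b->fleecing[block], b->disk[block], BLOCK_SIZE);
        b->cached[block] = true;
    }
    memcpy(b->disk[block], data, BLOCK_SIZE);
}

/* Backup: read the point-in-time data, preferring the fleecing cache. */
static void backup_block(ToyBackup *b, int block, char *target)
{
    const char *src = b->cached[block] ? b->fleecing[block] : b->disk[block];

    memcpy(target, src, BLOCK_SIZE);
    b->backed_up[block] = true;
}
```

In this model the guest write only waits for a local copy into the
cache, not for the backup target, which is where the IO performance
benefit comes from.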

With this series it will be possible to enable backup-fleecing via
e.g. `vzdump 123 --fleecing enabled=1,storage=local-lvm` with fleecing
images created on the storage `local-lvm`. The fleecing storage should
be a fast local storage which supports thin-provisioning and discard.
If the storage supports qcow2, that is used as the fleecing image
format. If the underlying file system does not support discard, with
qcow2 and preallocation=off, at least already allocated parts of the
image can be re-used later.

Fleecing images are created by qemu-server via pve-storage and
attached to QEMU before the backup starts, and cleaned up after the
backup finished or failed. The naming schema for fleecing images is
'vm-ID-fleece-N(.FORMAT)'. The allocated images are recorded in the
guest configuration, so that even after a hard failure, clean-up can
be re-attempted. While not too bad, it's a non-trivial amount of code
and I'm not 100% sure about the cost-benefit, so sending those as RFC.

The fleecing image needs to be the exact same size as the source, but
luckily, an explicit size can be specified when attaching a raw image
to QEMU so there are no size issues when using storages that have
coarser allocation/round up. For qcow2, it seems that virtual size can
be nearly arbitrary (i.e. modulo 512 byte granularity) during
allocation.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2024-April/062815.html

Originally-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Tue, 12 Mar 2024 13:08:48 +0000 (14:08 +0100)]

bump version to 8.1.5-4

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Tue, 12 Mar 2024 12:54:59 +0000 (13:54 +0100)]

backup: factor out & clean up gathering device info into helper

Squash the two original patches [0][1] from Fiona, which were sent
separately to be easier to review, into the big patch that adds the
Proxmox backup integration.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2024-January/061479.html
[1]: https://lists.proxmox.com/pipermail/pve-devel/2024-January/061478.html

Originally-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Tue, 9 Jan 2024 14:10:00 +0000 (15:10 +0100)]

backup: avoid bubbling up first ECANCELED error

With pvebackup_propagate_error(), the first error wins. When one job
in the transaction fails, it is expected that later jobs get the
ECANCELED error. Those are not interesting and by skipping them a more
interesting error, which is likely the actual root cause, can win.
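
A simplified sketch of the "first error wins, but skip ECANCELED" rule.
The struct and function names are hypothetical and plain errno values
stand in for the error objects used by the real
pvebackup_propagate_error(); the case where ECANCELED is the only error
is not modelled here.

```c
#include <errno.h>

/* Hypothetical, simplified backup state. */
typedef struct {
    int first_error; /* 0 while no error has been recorded */
} ToyBackupState;

/* Record a job result; only the first interesting error is kept. */
static void record_job_error(ToyBackupState *state, int err)
{
    if (err == 0 || err == ECANCELED) {
        return; /* success or an expected follow-up cancellation */
    }
    if (state->first_error == 0) {
        state->first_error = err;
    }
}
```

If jobs report ECANCELED, EIO and ENOSPC in that order, the recorded
error is EIO rather than the uninformative ECANCELED.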

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Tue, 9 Jan 2024 14:09:59 +0000 (15:09 +0100)]

cleanup: squash backup dump driver change into patch introducing the driver

Makes it simpler and shorter. Still results in the same code after
applying both patches in question.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Tue, 12 Mar 2024 08:47:50 +0000 (09:47 +0100)]

fix patch for accepting NULL qiov when padding

All callers of the function pass an address, so dereferencing once
before checking for NULL is required. It's also necessary to update
bytes and offset nevertheless, so the request will actually be aligned
later and not trigger an assertion failure.

Seems like this was accidentally broken in 8dca018 ("udpate and rebase
to QEMU v6.0.0") and this is effectively a revert to the original
version of the patch. The qiov functions changed back then, which
might've been the reason Stefan tried to simplify the patch.

Should fix live-import for certain kinds of VMDK images.

Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Wed, 21 Feb 2024 19:11:27 +0000 (20:11 +0100)]

bump version to 8.1.5-3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Wed, 21 Feb 2024 13:01:52 +0000 (14:01 +0100)]

add patch to fix deadlock with VirtIO block and iothread during QMP stop

Backported from commit bfa36802d1 ("virtio-blk: avoid using ioeventfd
state in irqfd conditional") because the rework/rename dataplane ->
ioeventfd didn't happen yet.

Reported in the community forum [0] and reproduced doing a backup loop
to PBS with suspend mode with fio doing heavy IO in the guest and
using an RBD storage (with krbd).

[0]: https://forum.proxmox.com/threads/141320

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Mon, 5 Feb 2024 13:13:17 +0000 (14:13 +0100)]

fix #4507: add patch to automatically increase NOFILE soft limit

In many configurations, e.g. multiple vNICs with multiple queues or
with many Ceph OSDs, the default soft limit of 1024 is not enough.
QEMU is supposed to work fine with file descriptors >= 1024 and does
not use select() on POSIX. Bump the soft limit to the allowed hard
limit to avoid issues with the aforementioned configurations.
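
Raising the soft limit to the hard limit can be sketched with the POSIX
getrlimit()/setrlimit() calls. The function name is hypothetical and
the code is an illustration of the approach, not the patch itself.

```c
#include <sys/resource.h>

/* Returns 0 on success, -1 if the limits could not be read or changed. */
static int raise_nofile_soft_limit(void)
{
    struct rlimit nofile;

    if (getrlimit(RLIMIT_NOFILE, &nofile) != 0) {
        return -1;
    }
    if (nofile.rlim_cur == nofile.rlim_max) {
        return 0; /* soft limit already equals the hard limit */
    }
    nofile.rlim_cur = nofile.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &nofile) == 0 ? 0 : -1;
}
```

Raising the soft limit up to the hard limit does not require extra
privileges, so this can run early during process start-up.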

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Thomas Lamprecht [Fri, 2 Feb 2024 18:41:31 +0000 (19:41 +0100)]

bump version to 8.1.5-2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Fri, 2 Feb 2024 18:35:31 +0000 (19:35 +0100)]

work around stuck guest IO with iothread and VirtIO block/SCSI

This essentially repeats commit 6b7c181 ("add patch to work around
stuck guest IO with iothread and VirtIO block/SCSI") with an added
fix for the SCSI event virtqueue, which requires special handling.
This is to avoid the issue [3] that made the revert 2a49e66 ("Revert
"add patch to work around stuck guest IO with iothread and VirtIO
block/SCSI"") necessary the first time around.

When using iothread, after commits
1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
it can happen that polling gets stuck when draining. This would cause
IO in the guest to get completely stuck.

A workaround for users is stopping and resuming the vCPUs because that
would also stop and resume the dataplanes which would kick the host
notifiers.

This can happen with block jobs like backup and drive mirror as well
as with hotplug [2].

There are reports in the community forum that might be about this
issue [0][1] and there is also one in the enterprise support channel.

As a workaround in the code, just re-enable notifications and kick the
virt queue after draining. Draining is already costly and rare, so no
need to worry about a performance penalty here.

Take special care to attach the SCSI event virtqueue host notifier
with the _no_poll() variant like in virtio_scsi_dataplane_start().
This avoids the issue from the first attempted fix where the iothread
would suddenly loop with 100% CPU usage whenever some guest IO came in
[3]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
don't waste CPU polling the event virtqueue"). See [4] for the
relevant discussion.

[0]: https://forum.proxmox.com/threads/137286/
[1]: https://forum.proxmox.com/threads/137536/
[2]: https://issues.redhat.com/browse/RHEL-3934
[3]: https://forum.proxmox.com/threads/138140/
[4]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/

Link: https://lore.kernel.org/qemu-devel/20240202153158.788922-1-hreitz@redhat.com/
Originally-by: Fiona Ebner <f.ebner@proxmox.com>
[ TL: Update to v2 and rebased patch series handling to v8.1.5 ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Thomas Lamprecht [Fri, 2 Feb 2024 18:08:16 +0000 (19:08 +0100)]

bump version to 8.1.5-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

Fiona Ebner [Tue, 30 Jan 2024 14:14:38 +0000 (15:14 +0100)]

stable fixes for corner case in i386 emulation and crash with VNC clipboard

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Tue, 30 Jan 2024 14:14:37 +0000 (15:14 +0100)]

update submodule and patches to QEMU 8.1.5

Most notable fixes from a Proxmox VE perspective are:

* "virtio-net: correctly copy vnet header when flushing TX"
  To prevent a stack overflow that could lead to leaking parts of the
  QEMU process's memory.
* "hw/pflash: implement update buffer for block writes"
  To prevent an edge case for half-completed writes. This potentially
  affected EFI disks.
* Fixes to i386 emulation and ARM emulation.

No changes for patches were necessary (all are just automatic context
changes).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 15 Dec 2023 13:24:58 +0000 (14:24 +0100)]

bump version to 8.1.2-6

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

Fiona Ebner [Fri, 15 Dec 2023 13:10:35 +0000 (14:10 +0100)]

Revert "add patch to work around stuck guest IO with iothread and VirtIO block/SCSI"

This reverts commit 6b7c1815e1c89cb66ff48fbba6da69fe6d254630.

The attempted fix has been reported to cause high CPU usage after
backup [0]. This is not difficult to reproduce and is caused by
iothreads getting stuck in a loop. Downgrading to pve-qemu-kvm=8.1.2-4
helps, which was also verified by Christian, thanks! The issue this was
supposed to fix is much rarer, so revert for now, while upstream is
still working on a proper fix.

[0]: https://forum.proxmox.com/threads/138140/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 11 Dec 2023 15:59:16 +0000 (16:59 +0100)]

bump version to 8.1.2-5

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Mon, 11 Dec 2023 13:28:39 +0000 (14:28 +0100)]

pick fix for potential deadlock with QMP resize and iothread

While the patch gives bdrv_graph_wrlock() as an example where the
issue can manifest, something similar can happen even when graph
locking is disabled. Was able to reproduce the issue with

    while true; do qm resize 115 scsi0 +4M; sleep 1; done

while running

    fio --name=make-mirror-work --size=100M --direct=1 --rw=randwrite \
        --bs=4k --ioengine=psync --numjobs=5 --runtime=1200 --time_based

in the VM.

Fix picked up from:
https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg01102.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Mon, 11 Dec 2023 13:28:38 +0000 (14:28 +0100)]

add patch to work around stuck guest IO with iothread and VirtIO block/SCSI

When using iothread, after commits
1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
it can happen that polling gets stuck when draining. This would cause
IO in the guest to get completely stuck.

A workaround for users is stopping and resuming the vCPUs because that
would also stop and resume the dataplanes which would kick the host
notifiers.

This can happen with block jobs like backup and drive mirror as well
as with hotplug [2].

There are reports in the community forum that might be about this
issue [0][1], and there is also one in the enterprise support channel.

As a workaround in the code, just re-enable notifications and kick the
virt queue after draining. Draining is already costly and rare, so no
need to worry about a performance penalty here. This was taken from
the following comment of a QEMU developer [3] (in my debugging,
I had already found re-enabling notification to work around the issue,
but also kicking the queue is more complete).

[0]: https://forum.proxmox.com/threads/137286/
[1]: https://forum.proxmox.com/threads/137536/
[2]: https://issues.redhat.com/browse/RHEL-3934
[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 22 Nov 2023 13:28:25 +0000 (14:28 +0100)]

bump version to 8.1.2-4

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 22 Nov 2023 12:41:14 +0000 (13:41 +0100)]

add fix for vnc clipboard

This fixes the host->guest direction with noVNC as a client (and
likely others).

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 20 Nov 2023 09:35:10 +0000 (10:35 +0100)]

bump version to 8.1.2-3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Mon, 20 Nov 2023 09:16:17 +0000 (10:16 +0100)]

fix #5054: backport fix for software reset with SATA

The issue prevented FreeBSD 14 VMs with SATA disk from booting.

The commit it fixes e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
PxCI handling") is part of stable 8.1.2.

The patch was already applied to the block branch upstream:
https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 17 Nov 2023 10:55:26 +0000 (11:55 +0100)]

bump version to 8.1.2-2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Fri, 17 Nov 2023 10:45:41 +0000 (11:45 +0100)]

revert commit breaking VirtIO network adapters for certain versions of Windows

As reported in the community forum [0] and reproduced locally this
breaks VirtIO network adapters in (at least) the German ISO of Windows
Server 2022. The fix itself was for

> Issue is not fatal but as result acpi-index/"PCI Label ID" property
> is either not shown in device details page or shows incorrect value.

so revert and tolerate that as a stop-gap, rather than have the
devices not working at all.

[0]: https://forum.proxmox.com/threads/92094/post-605684

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 7 Nov 2023 14:28:24 +0000 (15:28 +0100)]

fix #4710: vma create: don't use O_DIRECT for tmpfs

The implementation of the helper is_path_tmpfs() is similar to the
existing qemu_fd_getfs() function in util/mmap-alloc.c, which
unfortunately only takes an existing fd.
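
The decision can be sketched as follows. This is not the C helper from the patch: instead of a statfs()-style check, this Linux-only Python sketch looks up the file system type in /proc/mounts-style text. The names fs_type_for_path() and use_o_direct() and the sample mount table are hypothetical, and octal escapes in mount points (for example \040 for a space) are not handled.

```python
import os

# Hypothetical excerpt in the format of /proc/mounts.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0\n"
)


def fs_type_for_path(path, mounts_text):
    """Return the file system type of the mount that contains `path`."""
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest matching mount point wins; on a tie, the later
            # entry wins because it shadows the earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def use_o_direct(path, mounts_text):
    """Skip O_DIRECT when the target lives on tmpfs."""
    return fs_type_for_path(path, mounts_text) != "tmpfs"
```

To check a real path, pass the contents of /proc/mounts as mounts_text.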

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 24 Oct 2023 11:43:10 +0000 (13:43 +0200)]

bump version to 8.1.2-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 24 Oct 2023 12:59:49 +0000 (14:59 +0200)]

d/control: add python3-venv as build-dependency

Seems to be required since commit 81e2b198a8 ("configure: create a
python venv unconditionally").

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:12 +0000 (14:10 +0200)]

d/control: add versioned Breaks for qemu-server <= 8.0.6

Upstream QEMU commit 4271f40383 ("virtio-net: correctly report maximum
tx_queue_size value") made setting an invalid tx_queue_size for a
non-vDPA/vhost-user net device a hard error. Now, qemu-server before
commit 089aed81 ("cfg2cmd: netdev: fix value for tx_queue_size") did
just that, so the newer QEMU version would break start-up for most VMs
(a default vNIC configuration would be affected).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:11 +0000 (14:10 +0200)]

add patch to avoid huge snapshot performance regression

Taking a snapshot became prohibitively slow because of the
migration_transferred_bytes() call in migration_rate_exceeded() [0].

This also applied to the async snapshot taking in Proxmox VE, so
work around the issue until it is fixed upstream.

[0]: https://gitlab.com/qemu-project/qemu/-/issues/1821

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:10 +0000 (14:10 +0200)]

add patch to disable graph locking

There are still some issues with graph locking, e.g. deadlocks during
backup canceling [0] and initial attempts to fix it didn't work [1].
Because the AioContext locks still exist, it should still be safe to
disable graph locking.

[0]: https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg00729.html
[1]: https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg06905.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:09 +0000 (14:10 +0200)]

update submodule and patches to QEMU 8.1.2

Bigger notable changes:

* Commit 1a30b0f5d7 ("block: .bdrv_open is non-coroutine and
  unlocked") broke the PVE backup patches, in particular setting up
  the backup dump block driver, because bdrv_new_open_driver() cannot
  be called from a coroutine. To fix it, bdrv_co_open() is used
  instead, and while it's a much more involved function, the result
  should be essentially the same. The only difference I noticed is
  that the BDRV_O_ALLOW_RDWR flag is also set in the resulting bds
  (block driver state), but that shouldn't hurt.

Smaller notable changes:

* aio_set_fd_handler() dropped its 'is_external' parameter stating
  that all callers now pass false in 60f782b6b7 ("aio: remove
  aio_disable_external() API"). The calls in the PVE patches also
  passed false, so just drop the parameter too.

* global_state_store() does not have a return value anymore, so the
  user in the PVE savevm-async patch was adapted. For context, see
  c33f1829f8 ("migration: never fail in global_state_store()").

* Renames affecting the PVE savevm-async patch:
  migrate_use_block() -> migrate_block() and ram_counters -> mig_stats
  9d4b1e5f22 ("migration: Move migrate_use_block() to options.c")
  aff3f6606d ("migration: Rename ram_counters to mig_stats")

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:08 +0000 (14:10 +0200)]

buildsys: use QEMU's keycodemapdb again

instead of the split-out version that was last updated for QEMU 6.0.
This reverts the relevant part of 6838f03 ("bump version to 2.11.1-1")
which doesn't state a reason why the splitting was done. If something
breaks, we can still re-do it and document the reason this time.

Alternatively, it would be necessary to adapt the paths, because
keycodemapdb lives in subprojects/ rather than ui/ since QEMU commit
c53648abba ("meson: use subproject for keycodemapdb").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:07 +0000 (14:10 +0200)]

buildsys: fixup submodule target

It's not enough to initialize the submodules anymore, as some got
replaced by wrap files, see QEMU commit 2019cabfee ("meson:
subprojects: replace submodules with wrap files").

Download the subprojects during initialization of the QEMU submodule,
so building (without the automagical --enable-download) can succeed
afterwards.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 17 Oct 2023 12:10:06 +0000 (14:10 +0200)]

d/rules: use disable-download option instead of git-submodules=ignore

See the following QEMU commits for reference:
0c5f3dcbb2 ("configure: add --enable-pypi and --disable-pypi")
ac4ccac740 ("configure: rename --enable-pypi to --enable-download, control subprojects too")
6f3ae23b29 ("configure: remove --with-git-submodules=")

The last one removed the option and the closest thing to
git-submodules=ignore is using disable-download, which will then just
verify that the submodules are present.

Building now will require either
* running 'meson subprojects download' in the qemu submodule first, or
* using --enable-download, but then the submodules would be downloaded
  for each build (if not already downloaded in the submodule first)
  and it's just a bit too surprising if downloads happen during build.

The disable-download option will also disable automatic downloading of
missing Python modules from PyPI. Hopefully, it's enough to add them
as Debian build dependencies when required.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 4 Oct 2023 06:33:39 +0000 (08:33 +0200)]

bump version to 8.0.2-7

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Thu, 24 Aug 2023 13:51:11 +0000 (15:51 +0200)]

fix #2874: SATA: avoid unsolicited write to sector 0 during reset

If there is a pending DMA operation during ide_bus_reset(), the fact
that the IDEstate is already reset before the operation is canceled
can be problematic. In particular, ide_dma_cb() might be called and
then use the reset IDEstate which contains the signature after the
reset. When used to construct the IO operation this leads to
ide_get_sector() returning 0 and nsector being 1. This is particularly
bad, because a write command will thus destroy the first sector which
often contains a partition table or similar.
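
The ordering problem can be illustrated with a toy model. This is a heavily simplified Python sketch, not QEMU code: ToyIDEState, its fields and run_sequence() are hypothetical, and the only behavior taken from the description above is that the DMA callback uses whatever state it finds and that the reset state decodes to sector 0 with a count of 1.

```python
from dataclasses import dataclass


@dataclass
class ToyIDEState:
    sector: int    # start sector of the in-flight request
    nsector: int   # number of sectors still to transfer

    def reset(self):
        # After a reset, the registers hold the signature; in this toy
        # model that decodes to start sector 0 and a count of 1.
        self.sector = 0
        self.nsector = 1


def run_sequence(order, state):
    """Apply 'reset' and 'cancel' in order; return what the DMA callback saw."""
    seen = []
    pending = True
    for step in order:
        if step == "reset":
            state.reset()
        elif step == "cancel" and pending:
            # Cancelling may invoke the DMA callback one more time, and the
            # callback builds its IO from the current state.
            seen.append((state.sector, state.nsector))
            pending = False
    return seen


problematic = run_sequence(["reset", "cancel"], ToyIDEState(2048, 8))
safe = run_sequence(["cancel", "reset"], ToyIDEState(2048, 8))
```

In the problematic order, the callback targets sector 0. When the operation is cancelled first, it still sees the original request.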

Upstream discussion:
https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg04239.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Fri, 8 Sep 2023 09:18:30 +0000 (11:18 +0200)]

vma: avoid compiler warning about incompatible pointer type

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Filip Schauer [Fri, 8 Sep 2023 08:49:07 +0000 (10:49 +0200)]

backup: Fix spelling error in function name

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
[FE: fixup patch context]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 6 Sep 2023 15:04:04 +0000 (17:04 +0200)]

bump version to 8.0.2-6

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 6 Sep 2023 08:45:12 +0000 (10:45 +0200)]

backup: drop broken BACKUP_FORMAT_DIR

Since upstream QEMU 8.0, it's no longer possible to call
bdrv_img_create() from a coroutine, meaning a backup with the
directory format would crash the QEMU instance.

The feature is only exposed via the monitor and was intended to be
experimental. There were no user reports about the breakage and it
only was noticed during the rebase for QEMU 8.1, because other parts
of the backup code needed adaptation and I decided to check the
BACKUP_FORMAT_DIR case too.

It should not stay in a broken state of course, but avoid the
maintenance cost and just make it a removed feature for Proxmox VE 8
retroactively.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 6 Sep 2023 08:45:11 +0000 (10:45 +0200)]

backup: create jobs in a drained section

With the drive-backup QMP command, upstream QEMU uses a drained
section for the source drive when creating the backup job. Do the same
here to avoid subtle bugs.

There, the drained section extends until after the job is started, but
this cannot be done here for multi-disk backups (could at most start
the first job). The important thing is that the cbw
(copy-before-write) node is in place and the bcs (block-copy-state)
bitmap is initialized, which both happen during job creation (ensured
by the "block/backup: move bcs bitmap initialization to job creation"
PVE patch).

One such bug was reported in the community forum [0], where using a
drive with iothread can lead to an overlapping block-copy request and
consequently an assertion failure. The block-copy code relies on the
bcs bitmap to determine if a request for a certain range can be
created. Each time a request is created, it resets the bcs bitmap at
that range to indicate that it's being handled.

The duplicate request can happen as follows:
Thread A attaches the cbw node
Thread B creates a request and resets the bitmap at that range
Thread A clears the bitmap and merges it with the PBS bitmap
The merging can lead to the bitmap being set again at the range of
the previous request, so the block-copy code thinks it's fine to
create a request there.
Thread B creates another request at an overlapping range before the
other request is finished.

The drained section ensures that nothing else can interfere with the
bcs bitmap between attaching the copy-before-write block node and
initialization of the bitmap.
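
The interleaving described above can be replayed in a small single-threaded model. This is only a sketch under simplifying assumptions: ranges are plain integers, requests never finish, and the contents of both bitmaps are hypothetical. The names simulate() and has_overlap() are made up for illustration.

```python
def simulate(drained):
    """Return the ranges for which a block-copy request was created."""
    bcs_bitmap = {0, 1, 2, 3}   # ranges that still need to be copied
    pbs_bitmap = {0, 1, 2, 3}   # hypothetical PBS bitmap merged at init
    requests = []

    def create_request(rng):
        # Thread B: a request is only created where the bcs bitmap is set.
        if rng in bcs_bitmap:
            bcs_bitmap.discard(rng)   # mark the range as being handled
            requests.append(rng)

    def initialize_bitmap():
        # Thread A: clear the bitmap, then merge in the PBS bitmap.
        bcs_bitmap.clear()
        bcs_bitmap.update(pbs_bitmap)

    # Thread A has attached the cbw node at this point.
    if not drained:
        create_request(0)   # Thread B gets in before initialization
    initialize_bitmap()
    create_request(0)       # Thread B creates a request for range 0
    return requests


def has_overlap(requests):
    # Requests never complete in this model, so any repeated range overlaps.
    return len(requests) != len(set(requests))
```

Without the drained section, simulate(False) yields two requests for range 0. With it, simulate(True) yields exactly one, because Thread B cannot run between attaching the node and initializing the bitmap.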

[0]: https://forum.proxmox.com/threads/133149/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 6 Sep 2023 08:45:10 +0000 (10:45 +0200)]

regenerate patch stats

Apparently wasn't correct in 0cff91a ("fix #1534: vma: Add extract
filter for disk images").

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Filip Schauer [Wed, 30 Aug 2023 08:33:47 +0000 (10:33 +0200)]

fix #1534: vma: Add extract filter for disk images

Add a filter to the "vma extract" command. A comma separated list of
disk images that should be extracted can be passed with the "-d" option.

Example to extract an IDE drive and an SCSI drive from vzdump.vma:

vma extract vzdump.vma -d "drive-ide0,drive-scsi0" extractdir
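
The filter semantics can be sketched in a few lines of Python. This is not the vma implementation: select_devices() is a hypothetical name, and both whitespace trimming and raising an error for unknown device names are assumptions made for the illustration.

```python
def parse_filter(arg):
    """Split a comma separated "-d" argument into a set of device names."""
    return {name.strip() for name in arg.split(",") if name.strip()}


def select_devices(available, arg=None):
    """Return the devices to extract, keeping the archive's order."""
    if arg is None:
        return list(available)   # no filter given: extract everything
    wanted = parse_filter(arg)
    missing = wanted - set(available)
    if missing:
        # Assumption: unknown names are treated as an error.
        raise ValueError("not in archive: " + ", ".join(sorted(missing)))
    return [dev for dev in available if dev in wanted]


archive = ["drive-ide0", "drive-scsi0", "drive-virtio0"]
selected = select_devices(archive, "drive-ide0,drive-scsi0")
```

With the example filter from above, selected contains drive-ide0 and drive-scsi0, and drive-virtio0 is skipped.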

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 16 Aug 2023 09:56:49 +0000 (11:56 +0200)]

bump version to 8.0.2-5

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Mon, 14 Aug 2023 08:53:19 +0000 (10:53 +0200)]

backup: trim heap after finishing

Reported in the community forum [0]. By default, there can be large
amounts of memory left assigned to the QEMU process after backup.
Likely because of fragmentation, it's necessary to explicitly call
malloc_trim() to tell glibc that it shouldn't keep all that memory
resident for the process.

QEMU itself already does a malloc_trim() in the RCU thread, but that
code path might not be reached (or not for a long time) under usual
operation. The value of 4 MiB for the argument was also copied from
there.
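
For illustration, the same glibc call can be made from Python via ctypes. This is a minimal sketch, assuming a Linux system with glibc; it trims the heap of the Python process itself and has nothing to do with the QEMU patch beyond using malloc_trim() with the same 4 MiB argument. The name trim_heap() is hypothetical.

```python
import ctypes

TRIM_PAD = 4 * 1024 * 1024   # 4 MiB, the value mentioned above


def trim_heap(pad=TRIM_PAD):
    """Call malloc_trim(pad); return its result or None if unavailable."""
    try:
        libc = ctypes.CDLL(None)          # symbols of the running process
        malloc_trim = libc.malloc_trim    # glibc-specific, may be missing
    except (OSError, AttributeError, TypeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # glibc returns 1 if memory was released to the system, 0 otherwise.
    return malloc_trim(pad)
```

The pad argument is the amount of free space to leave untrimmed at the top of the heap.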

Example with the following configuration:
> agent: 1
> boot: order=scsi0
> cores: 4
> cpu: x86-64-v2-AES
> ide2: none,media=cdrom
> memory: 1024
> name: backup-mem
> net0: virtio=DA:58:18:26:59:9F,bridge=vmbr0,firewall=1
> numa: 0
> ostype: l26
> scsi0: rbd:base-107-disk-0/vm-106-disk-1,size=4302M
> scsihw: virtio-scsi-pci
> smbios1: uuid=b2d4511e-8d01-44f1-afd6-9581b30c24a6
> sockets: 2
> startup: order=2
> virtio0: lvmthin:vm-106-disk-1,iothread=1,size=1G
> virtio1: lvmthin:vm-106-disk-2,iothread=1,size=1G
> virtio2: lvmthin:vm-106-disk-3,iothread=1,size=1G
> vmgenid: 0a1d8751-5e02-449d-977e-c0160e900231

Before the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:   370948 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS: 2114964 kB

After the change:

> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:   398788 kB
> root@pve8a1 ~ # vzdump 106 --storage pbs
> (...)
> INFO: Backup job finished successfully
> root@pve8a1 ~ # grep VmRSS /proc/$(cat /var/run/qemu-server/106.pid)/status
> VmRSS:   424356 kB
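
The difference between the two runs is easier to see as numbers. The following sketch parses the VmRSS lines quoted above and computes the growth across the backup; parse_vmrss_kb() and growth_kb() are hypothetical helper names.

```python
def parse_vmrss_kb(status_text):
    """Extract the VmRSS value in kB from /proc/<pid>/status style text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def growth_kb(first, second):
    """RSS growth in kB between two status snapshots."""
    return parse_vmrss_kb(second) - parse_vmrss_kb(first)


without_trim = growth_kb("VmRSS:   370948 kB", "VmRSS: 2114964 kB")
with_trim = growth_kb("VmRSS:   398788 kB", "VmRSS:   424356 kB")

print(without_trim)   # 1744016
print(with_trim)      # 25568
```

That is roughly 1700 MiB of additional resident memory before the change, compared to about 25 MiB after it.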

[0]: https://forum.proxmox.com/threads/131339/

Co-diagnosed-by: Friedrich Weber <f.weber@proxmox.com>
Co-diagnosed-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Mon, 14 Aug 2023 08:52:25 +0000 (10:52 +0200)]

refresh patch context

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>

commit | commitdiff | tree

Filip Schauer [Mon, 7 Aug 2023 13:19:42 +0000 (15:19 +0200)]

Add format attributes to function candidates

Add format attributes to functions that take printf-like arguments. This
provides additional compile-time checking that the correct parameters
are passed to the functions.

This fixes compiler warnings generated by the -Wsuggest-attribute=format
flag.

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Thu, 3 Aug 2023 13:56:30 +0000 (15:56 +0200)]

add patch fixing fd leak for vhost

Each pause+resume operation (which is also done as part of taking a VM
snapshot) would increase the number of open file descriptors by the
number of vhost devices (e.g. network devices by default). This could
lead to crashes during backup and surely other issues once the system
limit (default 1024) was reached [0].
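
A quick back-of-the-envelope calculation shows how fast the limit could be reached. The function below is a sketch with a hypothetical name, and the baseline of 100 open file descriptors is an assumed figure, not a measurement.

```python
import math


def cycles_until_fd_limit(open_fds, vhost_devices, limit=1024):
    """Pause+resume cycles until the fd count reaches the limit."""
    if vhost_devices <= 0:
        raise ValueError("need at least one vhost device")
    remaining = limit - open_fds
    if remaining <= 0:
        return 0
    # Each cycle leaks one file descriptor per vhost device.
    return math.ceil(remaining / vhost_devices)


one_nic = cycles_until_fd_limit(open_fds=100, vhost_devices=1)
two_nics = cycles_until_fd_limit(open_fds=100, vhost_devices=2)
```

Under these assumptions, a VM with one vhost device would hit the limit after 924 cycles and a VM with two after 462, which regular snapshots or backups can reach over time.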

[0]: https://forum.proxmox.com/threads/131603/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fabian Grünbichler [Fri, 28 Jul 2023 10:59:10 +0000 (12:59 +0200)]

bump version to 8.0.2-4

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Fri, 28 Jul 2023 09:44:57 +0000 (11:44 +0200)]

add patch fixing resume for snapshot and hibernate with drive with iothread and a dirty bitmap

Not difficult to run into, just have a drive with iothread, take a PBS
backup and then take a snapshot or hibernate. Resuming will fail with
> qemu: qemu_mutex_unlock_impl: Operation not permitted
because of not acquiring the correct AioContext first.

Migration is not affected, because it runs in coroutine context.

Reported in the community forum:
https://forum.proxmox.com/threads/129899/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Thu, 15 Jun 2023 11:59:12 +0000 (13:59 +0200)]

bump version to 8.0.2-3

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Thu, 15 Jun 2023 11:39:00 +0000 (13:39 +0200)]

fix checks for drive mirror with bitmap

The QAPI change for QEMU 8.0 dropped redundant has_foo parameters, but
in the blockdev_mirror_common() function (which is not part of the
QAPI itself but called from there) the argument pair was has_bitmap
and bitmap_name rather than has_bitmap and bitmap.

Reported-by: Aaron Lauterer <a.lauterer@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Thu, 15 Jun 2023 11:38:59 +0000 (13:38 +0200)]

regenerate patches

There are still some context changes not covered by earlier series. No
functional change intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Fri, 9 Jun 2023 05:58:59 +0000 (07:58 +0200)]

bump version to 8.0.2-2

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 24 May 2023 13:56:53 +0000 (15:56 +0200)]

drop deprecated custom drive snapshot QMP commands

They are not required anymore since qemu-server >= 5.0-36.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Wed, 24 May 2023 13:56:52 +0000 (15:56 +0200)]

drop patch for custom get_link_status QMP command

There doesn't seem to be any Proxmox VE code using this.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 6 Jun 2023 14:35:20 +0000 (16:35 +0200)]

bump version to 8.0.2-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 6 Jun 2023 08:58:50 +0000 (10:58 +0200)]

update reentrancy patches to version in upstream git

The previous version was picked from the mailing list and still had
an object_dynamic_cast call in a hot path, which is avoided with the
version that landed in git.

Also adds a few more exceptions for devices that need reentrancy.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Fiona Ebner [Tue, 6 Jun 2023 08:58:49 +0000 (10:58 +0200)]

update submodule and patches to QEMU 8.0.2

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Wed, 24 May 2023 08:37:07 +0000 (10:37 +0200)]

buildsys: remove edk2 source tree when assembling build-dir

We ship it via pve-edk2-firmware anyway, and it only results in bigger
source tarballs and lintian yelling at us, because edk2 is not the
simplest repo to ensure DFSG compatibility for.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 22 May 2023 11:49:22 +0000 (13:49 +0200)]

bump version to 8.0.0-1

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Tue, 23 May 2023 12:09:03 +0000 (14:09 +0200)]

buildsys: avoid handling noopt locally, rather extend CFLAGS

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 22 May 2023 13:23:20 +0000 (15:23 +0200)]

d/rules: add indentation for configure switches for readability

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 22 May 2023 13:09:36 +0000 (15:09 +0200)]

d/control: drop obsolete build dependencies

Drop the autotools-dev, texi2html and texinfo build dependencies; they
are not used and have no effect.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

commit | commitdiff | tree

Thomas Lamprecht [Mon, 22 May 2023 11:51:22 +0000 (13:51 +0200)]

buildsys: auto-generate dbgsym package

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>

QEMU for PVE
