vzdump: common: allow 'job-id' as a parameter without being in schema
'job-id' is passed when a backup is started as a job and will be
passed to the notification system as matchable metadata. It
can be considered 'internal'.
Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
Reviewed-by: Max Carrara <m.carrara@proxmox.com>
replication: snapshot cleanup: only attempt to remove snapshots that exist
Since commit a6f5b35 ("replication: prepare: include volumes without
snapshots in the result"), attempts would be made to remove previous
replication snapshots from volumes on which they didn't exist. This
was noticed by Thomas since the output of a replication test in
pve-manager changed.
The issue is not completely new, i.e. there was no check that the
(previous) replication snapshot actually exists before attempting
removal during the cleanup phase. Fix the issue by adding such a
check.
The $replicate_snapshots hash is only used for this, so the change
there is fine.
Fixes: a6f5b35 ("replication: prepare: include volumes without snapshots in the result")
Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
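Illustrative Python sketch of the added check (the real code is Perl in pve-guest-common; the helper names `snapshots_of` and `remove_snapshot` are hypothetical):

```python
def cleanup_prev_snapshots(volumes, snapshots_of, prev_snap, remove_snapshot):
    """Remove the previous replication snapshot, but only from volumes
    on which it actually exists (hypothetical names, sketch only)."""
    for volid in volumes:
        # skip volumes without the expected snapshot instead of
        # attempting (and failing) the removal
        if prev_snap in snapshots_of.get(volid, {}):
            remove_snapshot(volid, prev_snap)
```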
Thomas Lamprecht [Wed, 17 Apr 2024 15:30:52 +0000 (17:30 +0200)]
guest helpers: avoid checking user/token if one can abort all tasks
If the user can already stop all tasks, there is no point in spending
work on every task to check whether the user could also stop it
without those powerful permissions.
To avoid too much indentation, rework the filter to an early-next style.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Friedrich Weber [Fri, 12 Apr 2024 14:15:49 +0000 (16:15 +0200)]
guest helpers: add helper to abort active guest tasks of a certain type
Given a `(type, user, vmid)` tuple, the helper aborts all tasks of the
given `type` for guest `vmid` that `user` is allowed to abort:
- If `user` has `Sys.Modify` on the node, they can abort any task
- If `user` is an API token, it can abort any task it started itself
- If `user` is a user, they can abort any task started by themselves
or one of their API tokens.
The helper is used to overrule any active qmshutdown/vzshutdown tasks
when attempting to stop a VM/CT (if requested).
Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
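The permission rules above could be sketched in Python as follows (the real helper is Perl; `may_abort_task` and its parameters are hypothetical names, but PVE token IDs really have the form 'user@realm!tokenname'):

```python
def may_abort_task(task_user: str, auth_id: str, has_sys_modify: bool) -> bool:
    """Sketch of the described check for a single task."""
    if has_sys_modify:
        # Sys.Modify on the node: may abort any task
        return True
    if '!' in auth_id:
        # API token: may only abort tasks it started itself
        return task_user == auth_id
    # plain user: own tasks, or tasks started by one of their API tokens
    return task_user == auth_id or task_user.startswith(auth_id + '!')
```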
It's a property string, because that avoids having an implicit
"enabled" as part of a 'fleecing-storage' property. And there likely
will be more options in the future, e.g. threshold/limit for the
fleecing image size.
Storage is non-optional, so the storage choice needs to be a conscious
decision. Can allow for a default later, when a good choice can be
made further down the stack. The original idea with "same storage as
VM disk" is not great, because e.g. for LVM, it would require the same
size as the disk up front.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
[ TL: style fix for whitespace placement in multi-line strings ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fiona Ebner [Wed, 13 Dec 2023 14:17:47 +0000 (15:17 +0100)]
abstract config: fix snapshot needed by replication check
Do not pass the cleanup flag to get_replicatable_volumes(), which
leads to replicatable volumes that have the replicate setting turned
off being part of the result.
Instead pass the noerr flag, because things like missing the
storage-level replicate feature should not lead to an error here.
Reported in the community forum:
https://forum.proxmox.com/threads/120910/post-605574
Fiona Ebner [Wed, 13 Dec 2023 14:17:45 +0000 (15:17 +0100)]
replication: prepare: include volumes without snapshots in the result
Note that PVE::Storage::volume_snapshot_info() will fail when a volume
does not exist, so no non-existing volume will end up in the result
(prepare() is only called with volumes that should exist).
This makes it possible to detect a volume without snapshots in the
result of prepare(), and as a consequence, replication will now also
fail early in a situation where source and remote volume both exist,
but (at least) one of them doesn't have any snapshots.
Such a situation can happen, for example, by deleting and re-creating
a volume with the same name on the source side without running
replication after deletion.
Lukas Wagner [Tue, 21 Nov 2023 10:22:05 +0000 (11:22 +0100)]
vzdump: config: add 'notification-mode' param for backup jobs
'legacy-sendmail': Use mailto/mailnotification parameters and send
emails directly.
'notification-system': Always notify via the notification system.
'auto': Notify via mail if mailto is set, otherwise use the
notification system.
The first two will be migrated to the notification system; the
'notification-target'/'notification-policy' params were part of the
first attempt at the new notification system. That attempt only ever
hit pvetest, so we simply tell the user not to use the two params.
Fiona Ebner [Fri, 23 Jun 2023 10:08:11 +0000 (12:08 +0200)]
replication: avoid passing removed storages to target
After removing a storage, replication states can still contain
references to it, even if no volume references it anymore.
If a storage does not exist in the storage configuration, the
replication target runs into an error when preparing the job locally.
This error prevents both running and removing the replication job. Fix
it by not passing the invalid storage ID in the first place.
vzdump: add config options for new notification backend
- Add new option 'notification-target'
Allows selecting the endpoint/group to which notifications shall be sent
- Add new option 'notification-policy'
Replacement for the now deprecated 'mailnotification' option. Mostly
just a rename for consistency, but also adds the 'never' option.
- Mark 'mailnotification' as deprecated in favor of 'notification-policy'
- Clarify that 'mailto' is ignored if 'notification-target' is set
vzdump: use worker aware log_warn from rest environment for warn level
This ensures that the alert counter is incremented when a message
with such a level is logged, and that the task is prominently marked
in the web UI task log.
The log_warn produces the exact same message format for the warn
level, so we can just swap out printing to STDERR for the warning
level without any change to the resulting text in the log. Keep
printing to the backup log file descriptor (saved on the storage) as is.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fiona Ebner [Tue, 28 Feb 2023 10:54:07 +0000 (11:54 +0100)]
abstract config: add method to calculate derived properties from a config
The HA manager currently needs to know internal details about the
configs and how the properties are calculated. With this method, those
details are abstracted away, allowing the configuration structure to
change. In particular, QemuConfig's 'memory' can be turned into
a property string without the HA manager needing to know about it
(once the HA manager has switched to using this method).
Fiona Ebner [Wed, 7 Jun 2023 14:54:50 +0000 (16:54 +0200)]
vzdump: config: improve description of ionice setting
The CFQ scheduler was removed with Linux 5.0 and ionice is now used
by the newer BFQ scheduler. Mention what the special value 8 does.
Also mention that for snapshot and suspend mode backups of VMs, the
setting only affects the compressor, because the kvm process is not a
child process of vzdump then and does not inherit the ionice priority.
If a tag is defined, test whether the user has specific access to that
VLAN (or access propagated from the full bridge ACL or the zone).
If trunks are defined, check permissions for each VLAN of the trunks.
If no tag is set, test whether the user has access to the full bridge.
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
FG:
- conditionalize check for bridge
- make trunk to tags helper private for now
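The described flow could be sketched in Python like this (the real check is Perl; `check_bridge_access` and the two ACL callbacks are hypothetical names, and the exact propagation semantics are an assumption):

```python
def check_bridge_access(has_bridge_acl, has_vlan_acl, tag=None, trunks=None):
    """Sketch: trunks -> check every VLAN of the trunks; tag -> check
    that VLAN (a full-bridge ACL propagates); neither -> full bridge."""
    if trunks:
        return all(has_vlan_acl(v) or has_bridge_acl() for v in trunks)
    if tag is not None:
        return has_vlan_acl(tag) or has_bridge_acl()
    return has_bridge_acl()
```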
Thomas Lamprecht [Thu, 16 Mar 2023 10:46:47 +0000 (11:46 +0100)]
config: ensure definedness for iterating pending & snapshot volumes
While it will work as is, autovivification can be a real PITA, so this
should make it more robust and might even avoid the occasional
warning about accessing undef values in the logs.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fiona Ebner [Wed, 15 Mar 2023 14:44:22 +0000 (15:44 +0100)]
fix #4572: config: also update volume IDs in pending section
The method is intended to be used in cases where the volumes actually
got renamed (e.g. migration). Thus, updating the volume IDs should of
course also be done for pending changes to avoid changes referring to
now non-existent volumes or even the wrong existing volume.
Thomas Lamprecht [Mon, 21 Nov 2022 07:09:24 +0000 (08:09 +0100)]
tag helpers: add get_unique_tags method for filtering out duplicates
Tags must be unique; allow the user some control over how unique (case
sensitivity) and honor the ordering settings (even if I doubt any
production setup wants to spend time and $$$ on cautiously
reordering all the tags of their dozens to hundreds of virtual guests).
Have some duplicate code to avoid checking too much in the loop
itself, as frequent branches can be more expensive.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
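A minimal Python sketch of the dedup logic (the real helper is Perl in GuestHelpers; the function name and flag are illustrative):

```python
def get_unique_tags(tags, case_sensitive=False):
    """Drop duplicate tags while preserving the given order; the first
    occurrence wins. Sketch only, not the actual implementation."""
    seen, result = set(), []
    for tag in tags:
        # fold case for the uniqueness check unless case-sensitive mode
        key = tag if case_sensitive else tag.lower()
        if key not in seen:
            seen.add(key)
            result.append(tag)
    return result
```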
Dominik Csapak [Wed, 16 Nov 2022 15:48:00 +0000 (16:48 +0100)]
GuestHelpers: add tag related helpers
'get_allowed_tags':
returns the allowed tags for the given user
'assert_tag_permissions':
helper to check permissions for tag setting/updating/deleting
for both container and qemu-server
gets the list of allowed tags from the DataCenterConfig and the current
user permissions, and checks for each tag that is added/removed if
the user has permissions to modify it
'normal' tags require 'VM.Config.Options' on '/vms/<vmid>', but tags
that are not generally allowed (either limited with 'user-tag-access'
or listed under 'privileged-tags' in the datacenter.cfg) require
'Sys.Modify' on '/'
Thomas Lamprecht [Sat, 12 Nov 2022 15:25:34 +0000 (16:25 +0100)]
vzdump: handle new jobs.cfg when removing VMIDs from backup jobs
We use the relatively new SectionConfig functionality for
parsing/writing unknown config types; that way, we can directly use
the available base job plugin for vzdump jobs and update only
those, keeping the other jobs untouched.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fiona Ebner [Mon, 3 Oct 2022 13:52:05 +0000 (15:52 +0200)]
vzdump: add 'performance' property string as a setting
Initially, to be used for tuning backup performance with QEMU.
A few users reported IO-related issues during backup after upgrading
to PVE 7.x and using a modified QEMU build with max-workers reduced to
8 instead of 16 helped them [0].
Also generalizes the way vzdump property strings are handled for
easier extension in the future.
replication: avoid "expected snapshot missing" warning when irrelevant
Only print it when there is a snapshot that would've been removed
without the safeguard. Mostly relevant when a new volume is added to
an already replicated guest.
Fixes replication tests in pve-manager.
Fixes: c0b2948 ("replication: prepare: safeguard against removal if expected snapshot is missing")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Fiona Ebner [Mon, 13 Jun 2022 10:29:59 +0000 (12:29 +0200)]
replication: prepare: safeguard against removal if expected snapshot is missing
Such a check would also have prevented the issue in 1aa4d84
("ReplicationState: purge state from non local vms") and other
scenarios where state and disk state are inconsistent with regard to
the last_sync snapshot.
AFAICT, all existing callers intending to remove all snapshots use
last_sync=1, so changing the behavior for other (non-zero) values should
be fine.
Fiona Ebner [Mon, 13 Jun 2022 10:29:58 +0000 (12:29 +0200)]
replication: also consider storages from replication state upon removal
This prevents left-over volume(s) in the following situation:
1. replication with volumes on different storages A and B
2. remove all volumes on storage B from the guest configuration
3. schedule full removal before the next normal replication runs
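The fix amounts to taking the union of both storage sets; a hedged Python sketch (the real code is Perl, and `storages_for_removal` is a hypothetical name):

```python
def storages_for_removal(config_storages, state_storages):
    """Union of the storages referenced by the guest config and by the
    replication state, so volumes on a storage already removed from the
    config (step 2 above) are still cleaned up on full removal."""
    return sorted(set(config_storages) | set(state_storages))
```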
Fiona Ebner [Mon, 13 Jun 2022 10:29:57 +0000 (12:29 +0200)]
replication: rename last_snapshots to local_snapshots
because prepare() was changed in 8d1cd44 ("partially fix #3111:
replication: be less picky when selecting incremental base") to return
all local snapshots.
Dominik Csapak [Fri, 3 Jun 2022 07:16:30 +0000 (09:16 +0200)]
ReplicationState: deterministically order replication jobs
if we have multiple jobs for the same vmid with the same schedule,
the last_sync, next_sync and vmid will always be the same, so the order
depends on the order of the $jobs hash (which is random; thanks perl)
to have a fixed order, take the jobid also into consideration
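In Python terms, the fix corresponds to appending the job ID to the sort key (illustrative sketch; the actual code is Perl in ReplicationState):

```python
def sort_replication_jobs(jobs):
    """Order jobs deterministically: by next_sync and vmid as before,
    with the job ID as a tie-breaker so jobs with identical schedules
    no longer depend on hash iteration order."""
    return sorted(
        jobs.keys(),
        key=lambda jobid: (jobs[jobid]['next_sync'], jobs[jobid]['vmid'], jobid),
    )
```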
Dominik Csapak [Fri, 3 Jun 2022 07:16:29 +0000 (09:16 +0200)]
ReplicationState: purge state from non local vms
when running replication, we don't want to keep replication states for
non-local vms. Normally this would not be a problem, since on migration,
we transfer the states anyway, but when the ha-manager steals a vm, it
cannot do that. In that case, having an old state lying around is
harmful, since the code does not expect the state to be out-of-sync
with the actual snapshots on disk.
One such problem is the following:
Replicate vm 100 from node A to node B and C, and activate HA. When node
A dies, it will be relocated to e.g. node B and start replicate from
there. If node B now had an old state lying around for its sync to node
C, it might delete the common base snapshots of B and C and cannot sync
again.
Deleting the state for all non local guests fixes that issue, since it
always starts fresh, and the potentially existing old state cannot be
valid anyway since we just relocated the vm here (from a dead node).
vzdump: schema: add 'notes-template' and 'protected' properties
In command_line(), notes are printed, quoted, but otherwise as is,
which is a bit ugly for multi-line notes. But it is part of the
commandline, so print it.
print snapshot tree: reduce indentation delta per level
previous:
> `-> foo 2021-05-28 12:59:36 no-description
>     `-> bar 2021-06-18 12:44:48 no-description
>         `-> current You are here!
now:
> `-> foo 2021-05-28 12:59:36 no-description
>  `-> bar 2021-06-18 12:44:48 no-description
>   `-> current You are here!
So it requires less space, allowing deeper snapshot trees to still be
displayed nicely, and looks even better while doing that - the latter
may be subjective though.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Fabian Ebner [Thu, 13 Jan 2022 11:04:01 +0000 (12:04 +0100)]
config: activate affected storages for snapshot operations
For snapshot creation, the storage for the vmstate file is activated
via vdisk_alloc when the state file is created.
Do not activate the volumes themselves, as that has unnecessary side
effects (e.g. waiting for zvol device link for ZFS, mapping the volume
for RBD). If a storage can only do snapshot operations on a volume
that has been activated, it needs to activate the volume itself.
The actual implementation will be in the plugins to be able to skip
CD ROM drives and bind-mounts, etc.