git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/commit
libceph: fix PG split vs OSD (re)connect race
author Ilya Dryomov <idryomov@gmail.com>
Tue, 20 Aug 2019 14:40:33 +0000 (16:40 +0200)
committer Khalid Elmously <khalid.elmously@canonical.com>
Wed, 4 Sep 2019 20:23:26 +0000 (16:23 -0400)
commit 972d762f5ac1e92f58f30c82f4a82a10f9199e01
tree 26e4fbe3b25ea4df6038e89ab92c8cee1c34cb58
parent b0cd144e198a3555b1a9ed70ea9faca3e7c0f957

BugLink: https://bugs.launchpad.net/bugs/1842114
commit a561372405cf6bc6f14239b3a9e57bb39f2788b0 upstream.

We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not.  ->peer_features is not valid unless the OSD session is open.  If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.

In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap.  However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).

Instead, let's drop this feature check.  It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent.  Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.
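
The race and the effect of dropping the check can be illustrated with a
minimal, self-contained sketch.  This is not the kernel code: the
demo_* names, the struct layout and the feature bit value are all
hypothetical and only model the decision described above, assuming that
->peer_features reads as zero while the session is not open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the RESEND_ON_SPLIT feature bit. */
#define DEMO_FEATURE_RESEND_ON_SPLIT (1ULL << 0)

/* Hypothetical, simplified view of an OSD session. */
struct demo_session {
	bool open;              /* has the session been (re)established? */
	uint64_t peer_features; /* only meaningful while open is true */
};

/*
 * Old behaviour (sketch): resend on a PG split only if the peer
 * advertises the feature.  The session state is never consulted, so
 * stale or zeroed peer_features silently suppress the resend.
 */
static bool demo_need_resend_old(const struct demo_session *s, bool split)
{
	return split &&
	       (s->peer_features & DEMO_FEATURE_RESEND_ON_SPLIT) != 0;
}

/*
 * New behaviour (sketch): a PG split alone triggers the resend.  A
 * pre-luminous OSD may see a duplicate, which it is expected to handle.
 */
static bool demo_need_resend_new(const struct demo_session *s, bool split)
{
	(void)s; /* session state no longer influences the decision */
	return split;
}

/* Walk through the race: a PG split arrives while the OSD reconnects. */
static void demo_race(void)
{
	struct demo_session reconnecting = {
		.open = false,
		.peer_features = 0, /* assumption: not valid yet, reads as 0 */
	};

	/* Old logic misses the resend and the request would hang. */
	assert(!demo_need_resend_old(&reconnecting, true));
	/* New logic resends regardless of session state. */
	assert(demo_need_resend_new(&reconnecting, true));
}
```

Calling demo_race() exercises the failing case: with the session not
yet open, the old predicate returns false on a split while the new one
returns true.  Without a split, both predicates return false, so the
change only affects the split path.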

Cc: stable@vger.kernel.org
Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
Link: https://tracker.ceph.com/issues/41162
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Jerry Lee <leisurelysw24@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
net/ceph/osd_client.c