Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boolean attributes should not be special in xmlMode #1805

Closed
rau8344 opened this issue Apr 10, 2021 · 2 comments · Fixed by #1903 · 4 remaining pull requests
Closed

Boolean attributes should not be special in xmlMode #1805

rau8344 opened this issue Apr 10, 2021 · 2 comments · Fixed by #1903 · 4 remaining pull requests

Comments

@rau8344
Copy link

rau8344 commented Apr 10, 2021

For HTML boolean attributes cheerio's attr() method behavior is both different from jquery and undocumented:

  • jquery documented behavior is to get the actual value of attributes (except for values matching the attribute name in version 1.5 and earlier, which gets true)
    • This seems to match actual behavior, as per the following unit test, run against jquery@3.4.1, jquery@1.6, and jquery@1.5
  • cheerio documented behavior refers to jquery documentation, without mentioning any edge cases attributes.
    • Actual behavior gets attribute name for boolean attributes, as per the following unit test using xmlMode.
    it("jquery reads boolean attributes as-is", () => {
        const text = `<Item name="a" checked="a" disabled="a" required="a" selected="a" />`;
        const xmlJ = jQuery(jQuery.parseXML(text));
        const rootJ = xmlJ.children();
        expect(rootJ.attr("name")).toBe("a");
        expect(rootJ.attr("checked")).toBe("a");
        expect(rootJ.attr("disabled")).toBe("a");
        expect(rootJ.attr("selected")).toBe("a");
        expect(rootJ.attr("required")).toBe("a");
    });

    it("cheerio gets the name of defined boolean attributes instead of value", () => {
        const options = { xmlMode: true };
        const text = `<Item name="a" checked="a" disabled="a" required="a" selected="a" />`;
        const xml = cheerio.load(text, options);
        const root = xml.root().children();
        expect(root.attr("name")).toBe("a");
        expect(root.attr("checked")).toBe("checked");
        expect(root.attr("disabled")).toBe("disabled");
        expect(root.attr("selected")).toBe("selected");
        expect(root.attr("required")).toBe("required");
    });

This behavior was apparently intentional, and commented in the source for getAttr() https://cheerio.js.org/lib_api_attributes.js.html, and made for made for #570. Stated reason was "jquery behaviour is to return the attrribute name when present", but I cannot find any version of jquery where that is the actual behavior. This may have related behavior for attribute setting, though I have not tested that.

Suggest one of the following:

  1. (Breaking change) Change attr() to always match documented jquery behavior.
  2. Change attr() to conditionally match documented jquery behavior if certain options are provided to cheerio.load() (e.g. for xmlMode)
  3. Change cheerio documentation to match the actual behavior.
  4. Provide additional API methods/overloads to allow both old and new behavior. (Could be confusing; would definitely require documentation)
@fb55
Copy link
Member

fb55 commented Apr 10, 2021

Very interesting case. It seems like this only applies to XML inside of jQuery — HTML actually shows the behavior described in #570.

@rau8344 My proposed solution would be to change the behavior in XML mode. How do you feel about this?

@rau8344
Copy link
Author

rau8344 commented Apr 10, 2021

@fb55 That makes sense to me.

@fb55 fb55 changed the title Boolean attribute behavior different from jquery and undocumented Boolean attributes should not be special in xmlMode Apr 14, 2021
@fb55 fb55 closed this as completed in #1903 Jun 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment